Hello, I will explain how the SVM algorithm works. This video will explain the support vector machine for linearly separable binary sets.

Suppose we have two features, x1 and x2, and we want to classify all of these elements. You can see that we have two classes, the circles and the rectangles. The goal of the SVM is to design a hyperplane (here we define this green line as the hyperplane) that separates all the training vectors into the two classes.

Here we show two different hyperplanes, both of which classify all the instances in this feature set correctly. But the best choice is the hyperplane that leaves the maximum margin from both classes. The margin is the distance between the hyperplane and the elements closest to it. In the case of the red hyperplane we have this distance, the margin, which we represent by z1; in the case of the green hyperplane we have the margin that we call z2. We can clearly see that z2 is greater than z1, so the margin is larger for the green hyperplane, and in this case the best choice is the green hyperplane.

Suppose we have chosen this hyperplane. It is defined by one equation, which we can state as g(x) = w·x + w0, where w is a vector of weights and w0 is a bias term. We scale the hyperplane so that g delivers values greater than or equal to +1 for all input vectors that belong to class 1 (in this case the circles), and values less than or equal to -1 for all input vectors that belong to class 2 (the rectangles). So we can say that for the elements closest to the hyperplane, the modulus of g is exactly 1.

From geometry we know that the distance between a point x and the hyperplane is computed as |w·x + w0| / ||w||. So the total margin, composed of the distance from the closest element on each side, is 2 / ||w||. The aim, then, is to minimize ||w||, which maximizes the separability: when we minimize this weight vector, we get the biggest margin splitting the two classes.

Minimizing the weight vector is a nonlinear optimization task, which can be solved by the Karush-Kuhn-Tucker (KKT) conditions, using Lagrange multipliers. The main equations state that the value of w will be the solution of the sum w = Σ_i λ_i y_i x_i, and we also have the rule Σ_i λ_i y_i = 0. So when we solve these equations, trying to minimize the weight vector w, we maximize the margin between the two classes, which maximizes their separability.
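To make the margin formulas concrete, here is a minimal Python sketch of the point-to-hyperplane distance and the resulting 2 / ||w|| margin; the particular values of w and w0 below are assumptions chosen only for illustration, not taken from the video's figures.

```python
import numpy as np

# Illustrative hyperplane g(x) = w . x + w0 = 0. These values of w and w0
# are arbitrary choices for demonstration, not the ones from the video.
w = np.array([1.0, 2.0])
w0 = -5.5

def distance_to_hyperplane(x):
    """Geometric distance from point x to the hyperplane: |w . x + w0| / ||w||."""
    return abs(np.dot(w, x) + w0) / np.linalg.norm(w)

# Under the canonical scaling, where g(x) = +1 and -1 at the closest points
# of the two classes, the total margin between the classes is 2 / ||w||,
# which is why minimizing ||w|| maximizes the margin.
total_margin = 2.0 / np.linalg.norm(w)

print(distance_to_hyperplane(np.array([1.0, 1.0])))  # ~1.118
print(total_margin)                                  # ~0.894
```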
Here we show a simple example. Suppose we have the two features x1 and x2, and we have these three points. We want to design, or find, the best hyperplane that divides the two classes. We can see clearly from the graph that the best dividing line will be parallel to the line that connects these two points here, so we can define the weight vector as this point minus this other point: w = (a, 2a), the constant a and 2 times that constant.

Now we can use this weight vector to write the hyperplane equation, but we still must discover the value of a. Since we have the weight vector w = (a, 2a), we can substitute the coordinates of these two points into g(x) = w·x + w0. When we apply g to the input vector (1, 1), we know the result must be -1, because this point belongs to the circle class; this gives the equation 3a + w0 = -1. When we apply g to the second point, we know it must deliver +1, so substituting it into the equation as well gives 8a + w0 = 1.

Given these two equations, we can isolate w0 in the second one, which gives w0 = 1 - 8a. Putting this value of w0 into the first equation, we reach the value of a, which is 2/5. Now that we have discovered a, we substitute it back and also discover the value of w0: w0 = 1 - 8 · (2/5) = -11/5. And since the weight vector is (a, 2a), substituting a = 2/5 delivers the weight vector w = (2/5, 4/5).

These two points are called the support vectors, because they compose the weight values 2/5 and 4/5. Substituting the values of w = (2/5, 4/5) and w0 = -11/5, we deliver the final equation that defines the green hyperplane, which, multiplying through by 5/2, is x1 + 2·x2 - 5.5 = 0. This hyperplane classifies the elements using the support vector machine.

These are some references that we have used. So this is how the SVM algorithm works.
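To close, here is a quick numerical check of the worked example above: a minimal Python sketch, assuming the positive support vector is the point (2, 3). The transcript does not spell out its coordinates, but with w = (a, 2a) it is the point that yields w·x = 8a, matching the equation 8a + w0 = 1 above.

```python
import numpy as np

# Support vectors of the worked example. g(1, 1) = -1 is stated in the video;
# the positive support vector is assumed to be (2, 3), since with w = (a, 2a)
# it gives w . x = 2a + 6a = 8a, matching the constraint 8a + w0 = +1.
s_neg = np.array([1.0, 1.0])   # circle class, g(s_neg) = -1
s_pos = np.array([2.0, 3.0])   # rectangle class, g(s_pos) = +1 (assumed coordinates)

# The weight vector is parallel to the difference of the support vectors,
# so it has the form w = (a, 2a). The two constraints are:
#   g(s_neg) = 3a + w0 = -1
#   g(s_pos) = 8a + w0 = +1
# Subtracting them gives 5a = 2.
a = 2.0 / 5.0
w = np.array([a, 2.0 * a])     # (2/5, 4/5)
w0 = 1.0 - 8.0 * a             # -11/5

def g(x):
    """Decision function g(x) = w . x + w0."""
    return np.dot(w, x) + w0

print(g(s_neg))   # -1.0, as required for the circle class
print(g(s_pos))   # +1.0, as required for the rectangle class

# Multiplying through by 5/2 recovers the hyperplane from the video:
# x1 + 2*x2 - 5.5 = 0.
print((5.0 / 2.0) * w, (5.0 / 2.0) * w0)   # [1. 2.] -5.5
```

Any new point x then falls on the rectangle side when g(x) > 0 and on the circle side when g(x) < 0.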