Hello, I will explain how the SVM algorithm works. This video will explain the support vector machine for linearly separable binary sets.

Suppose we have two features, x1 and x2, and we want to classify all of these elements. You can see that we have two classes, the circles and the rectangles. The goal of the SVM is to design a hyperplane (here we define this green line as the hyperplane) that separates all the training vectors into the two classes.

Here we show two different hyperplanes, both of which classify all the instances in this feature set correctly. But the best choice is the hyperplane that leaves the maximum margin from both classes. The margin is the distance between the hyperplane and the elements closest to it. In the case of the red hyperplane we have this distance, the margin, which we represent by z1; in the case of the green hyperplane we have the margin that we call z2. We can clearly see that z2 is greater than z1, so the margin is larger for the green hyperplane, and in this case the best choice is the green hyperplane.

Suppose we have chosen this hyperplane. It is defined by one equation, which we can state as g(x) = w·x + w0, where w is a vector of weights and w0 is a bias term. We scale the hyperplane so that g delivers values greater than or equal to +1 for all input vectors that belong to class 1 (in this case the circles), and values less than or equal to -1 for all input vectors that belong to class 2 (the rectangles). So we can say that for the elements closest to the hyperplane, the modulus of g is exactly 1.

From geometry we know that the distance between a point x and the hyperplane is computed as |w·x + w0| / ||w||. So the total margin, composed of the distance from the closest element on each side, is 2 / ||w||. The aim, then, is to minimize ||w||, which maximizes the separability: when we minimize this weight vector, we get the biggest margin splitting the two classes.

Minimizing the weight vector is a nonlinear optimization task, which can be solved by the Karush-Kuhn-Tucker (KKT) conditions, using Lagrange multipliers. The main equations state that the value of w will be the solution of the sum w = Σ_i λ_i y_i x_i, and we also have the rule Σ_i λ_i y_i = 0. So when we solve these equations, trying to minimize the weight vector w, we maximize the margin between the two classes, which maximizes their separability.
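To make the margin formulas concrete, here is a minimal Python sketch of the point-to-hyperplane distance and the resulting 2 / ||w|| margin; the particular values of w and w0 below are assumptions chosen only for illustration, not taken from the video's figures.

```python
import numpy as np

# Illustrative hyperplane g(x) = w . x + w0 = 0. These values of w and w0
# are arbitrary choices for demonstration, not the ones from the video.
w = np.array([1.0, 2.0])
w0 = -5.5

def distance_to_hyperplane(x):
    """Geometric distance from point x to the hyperplane: |w . x + w0| / ||w||."""
    return abs(np.dot(w, x) + w0) / np.linalg.norm(w)

# Under the canonical scaling, where g(x) = +1 and -1 at the closest points
# of the two classes, the total margin between the classes is 2 / ||w||,
# which is why minimizing ||w|| maximizes the margin.
total_margin = 2.0 / np.linalg.norm(w)

print(distance_to_hyperplane(np.array([1.0, 1.0])))  # ~1.118
print(total_margin)                                  # ~0.894
```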
Here we show a simple example. Suppose we have the two features x1 and x2, and we have these three points. We want to design, or find, the best hyperplane that divides the two classes. We can see clearly from the graph that the best dividing line will be parallel to the line that connects these two points here, so we can define the weight vector as this point minus this other point: w = (a, 2a), the constant a and 2 times that constant.

Now we can use this weight vector to write the hyperplane equation, but we still must discover the value of a. Since we have the weight vector w = (a, 2a), we can substitute the coordinates of these two points into g(x) = w·x + w0. When we apply g to the input vector (1, 1), we know the result must be -1, because this point belongs to the circle class; this gives the equation 3a + w0 = -1. When we apply g to the second point, we know it must deliver +1, so substituting it into the equation as well gives 8a + w0 = 1.

Given these two equations, we can isolate w0 in the second one, which gives w0 = 1 - 8a. Putting this value of w0 into the first equation, we reach the value of a, which is 2/5. Now that we have discovered a, we substitute it back and also discover the value of w0: w0 = 1 - 8 · (2/5) = -11/5. And since the weight vector is (a, 2a), substituting a = 2/5 delivers the weight vector w = (2/5, 4/5).

These two points are called the support vectors, because they compose the weight values 2/5 and 4/5. Substituting the values of w = (2/5, 4/5) and w0 = -11/5, we deliver the final equation that defines the green hyperplane, which, multiplying through by 5/2, is x1 + 2·x2 - 5.5 = 0. This hyperplane classifies the elements using the support vector machine.

These are some references that we have used. So this is how the SVM algorithm works.
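To close, here is a quick numerical check of the worked example above: a minimal Python sketch, assuming the positive support vector is the point (2, 3). The transcript does not spell out its coordinates, but with w = (a, 2a) it is the point that yields w·x = 8a, matching the equation 8a + w0 = 1 above.

```python
import numpy as np

# Support vectors of the worked example. g(1, 1) = -1 is stated in the video;
# the positive support vector is assumed to be (2, 3), since with w = (a, 2a)
# it gives w . x = 2a + 6a = 8a, matching the constraint 8a + w0 = +1.
s_neg = np.array([1.0, 1.0])   # circle class, g(s_neg) = -1
s_pos = np.array([2.0, 3.0])   # rectangle class, g(s_pos) = +1 (assumed coordinates)

# The weight vector is parallel to the difference of the support vectors,
# so it has the form w = (a, 2a). The two constraints are:
#   g(s_neg) = 3a + w0 = -1
#   g(s_pos) = 8a + w0 = +1
# Subtracting them gives 5a = 2.
a = 2.0 / 5.0
w = np.array([a, 2.0 * a])     # (2/5, 4/5)
w0 = 1.0 - 8.0 * a             # -11/5

def g(x):
    """Decision function g(x) = w . x + w0."""
    return np.dot(w, x) + w0

print(g(s_neg))   # -1.0, as required for the circle class
print(g(s_pos))   # +1.0, as required for the rectangle class

# Multiplying through by 5/2 recovers the hyperplane from the video:
# x1 + 2*x2 - 5.5 = 0.
print((5.0 / 2.0) * w, (5.0 / 2.0) * w0)   # [1. 2.] -5.5
```

Any new point x then falls on the rectangle side when g(x) > 0 and on the circle side when g(x) < 0.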