Machine Learning: Support Vector Machines
Support Vector Machines is a powerful and elegant machine learning algorithm that was developed in 1979 by Vapnik and became well known after it was applied in digital recognion problem (Cortes and Napnik, 1995).
The basic idea of SVM is to determine a hyperplane (widest street or gutter, see above figure), to separate the positive and negative data samples. To maximize the street, we need to minimize equation 3. From the langrange multiplier equation 4, we know that vector W can be expressed as a linear combination of a small subset of the training examples: those that lie exactly on margin (minimum distance to hyperplane), also called support vectors. In another word, the optimization problem of equation 4 only depends on the dot product of pairs of samples (equation 6). For an unknown sample, the decision rule also only depends on the dot product of those support vectors and the unknown sample (equation 5).
Instead of using Lagrange Multiplier to determine the maximum margin hyperplane, we can also use the concept of Convex Hull. The hyperplane is perdenficular to the vector that connects the nearest convex sets.
For training samples that are not linearly separable, we can try to find a kernal function (hyperplane) that maps training samples to a higher dimensional space (feature space) in which they are linearly separable, and do the classification in that higherdimensional space. Commonly used kernal functions include polynomial kernal, Gaussian kernel, Gaussian radial basis function, and Sigmoid kernel.
Here is an illustration of a walk-through example of SVM:
References:
http://image.diku.dk/imagecanon/material/cortes_vapnik95.pdf
https://www.youtube.com/watch?v=_PwhiWxHK8o
https://www.youtube.com/watch?v=6nDqY8MPLDM
https://www.youtube.com/watch?v=b2L2PYQIKE8
https://axon.cs.byu.edu/Dan/678/miscellaneous/SVM.example.pdf
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.10.5672&rep=rep1&type=pdf
https://cran.r-project.org/web/packages/e1071/vignettes/svmdoc.pdf
Cortes, Corinna; Vapnik, Vladimir N. (1995). “Support-vector networks” (PDF). Machine Learning. 20 (3): 273–297. CiteSeerX 10.1.1.15.9362. doi:10.1007/BF00994018. http://web.cecs.pdx.edu/~mm/AIFall2011/SVMs.pdf