ML: Support Vector Machines

Support Vector Machines is a powerful and elegant machine learning algorithm that was developed in 1979 by Vapnik and became well known after it was applied in digital recognion problem (Cortes and Napnik, 1995).

The basic idea of SVM is to determine a hyperplane (widest street or gutter, see above figure), to separate the positive and negative data samples. To maximize the street, we need to minimize equation 3. From the langrange multiplier equation 4, we know that vector W can be expressed as a linear combination of a small subset of the training examples: those that lie exactly on margin (minimum distance to hyperplane), also called support vectors. In another word, the optimization problem of equation 4 only depends on the dot product of pairs of samples (equation 6). For an unknown sample, the decision rule also only depends on the dot product of those support vectors and the unknown sample (equation 5).

Instead of using Lagrange Multiplier to determine the maximum margin hyperplane, we can also use the concept of Convex Hull. The hyperplane is perdenficular to the vector that connects the nearest convex sets.

For training samples that are not linearly separable, we can try to find a kernal function (hyperplane) that maps training samples to a higher dimensional space (feature space) in which they are linearly separable, and do the classification in that higherdimensional space. Commonly used kernal functions include polynomial kernal, Gaussian kernel, Gaussian radial basis function, and Sigmoid kernel.

Here is an illustration of a walk-through example of SVM:

References:

http://image.diku.dk/imagecanon/material/cortes_vapnik95.pdf

https://www.youtube.com/watch?v=_PwhiWxHK8o

https://www.youtube.com/watch?v=6nDqY8MPLDM

https://www.youtube.com/watch?v=b2L2PYQIKE8

https://axon.cs.byu.edu/Dan/678/miscellaneous/SVM.example.pdf

https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.10.5672&rep=rep1&type=pdf

https://cran.r-project.org/web/packages/e1071/vignettes/svmdoc.pdf

Cortes, Corinna; Vapnik, Vladimir N. (1995). “Support-vector networks” (PDF). Machine Learning. 20 (3): 273–297. CiteSeerX 10.1.1.15.9362. doi:10.1007/BF00994018. http://web.cecs.pdx.edu/~mm/AIFall2011/SVMs.pdf