# Support Vector Machines

The support vector machine (SVM) is a classification method. Two closely related methods are covered here:

1. Support Vector Classifier.
2. Support Vector Machine.

## Data has No Overlap - Separable Data

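Maximal Margin Classifier: at heart, this is an optimization problem. Among all hyperplanes that separate the two classes, we pick the one with the largest margin to the nearest training points.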
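For reference, the optimization problem is commonly written as below. This is a sketch in standard textbook notation, which these notes do not fix explicitly: $M$ is the margin, $\beta_0, \ldots, \beta_p$ are the hyperplane coefficients, $x_{ij}$ is feature $j$ of observation $i$, and the labels are assumed to be coded as $y_i \in \{-1, 1\}$.

$$
\max_{\beta_0, \ldots, \beta_p} M
\quad \text{subject to} \quad
\sum_{j=1}^{p} \beta_j^2 = 1,
\qquad
y_i (\beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}) \ge M \ \text{ for all } i = 1, \ldots, n.
$$

With the unit-norm constraint, the left-hand side of the inequality is the signed distance of observation $i$ from the hyperplane, so the constraint says every point lies on the correct side and at least $M$ away from the boundary.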

## Non-separable Data and Noisy Data

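We use soft margins: some observations are allowed to fall inside the margin, or even on the wrong side of the hyperplane.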

Procedures:

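Maximize $M$, for $\beta$ (unit vector, $\sum_{j=1}^{p} \beta_j^2 = 1$), so that

$$
y_i (\beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}) \ge M (1 - \epsilon_i),
$$

with $\epsilon_i \ge 0$ and $\sum_{i=1}^{n} \epsilon_i \le C$. $C$ is a parameter that is chosen to restrict the method: it caps the total amount by which the margin may be violated.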
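The role of the slack variables can be made concrete with a small sketch. It assumes the standard soft-margin constraint $y_i(\beta_0 + \beta \cdot x_i) \ge M(1 - \epsilon_i)$ with $\epsilon_i \ge 0$ and $\sum_i \epsilon_i \le C$; the function names, the toy data, and the hyperplane are all made up for illustration, and nothing is being fitted here.

```python
def soft_margin_slacks(X, y, beta, beta0, M):
    """Smallest slack eps_i >= 0 satisfying
    y_i * (beta0 + beta . x_i) >= M * (1 - eps_i) for each observation.
    beta is assumed to already be a unit vector."""
    slacks = []
    for x_i, y_i in zip(X, y):
        signed_dist = y_i * (beta0 + sum(b * v for b, v in zip(beta, x_i)))
        slacks.append(max(0.0, 1.0 - signed_dist / M))
    return slacks


def within_budget(slacks, C):
    """True if the total slack respects the budget C."""
    return sum(slacks) <= C


# Hypothetical toy data: two features, labels in {-1, +1}
X = [(2.0, 0.0), (0.5, 0.0), (-2.0, 0.0), (0.5, 1.0)]
y = [1, 1, -1, -1]

# Hypothetical hyperplane x1 = 0 with margin M = 1
slacks = soft_margin_slacks(X, y, beta=(1.0, 0.0), beta0=0.0, M=1.0)
print(slacks)
print(within_budget(slacks, C=2.0), within_budget(slacks, C=1.5))
```

A slack of 0 means the point is on the correct side of the margin, a slack between 0 and 1 means it is inside the margin but still correctly classified, and a slack above 1 means it is on the wrong side of the hyperplane. A smaller $C$ therefore tolerates fewer violations.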

### Feature Expansion

Construct more variables, for example squares and products of the original features, so that the feature space becomes higher-dimensional. A hyperplane in the enlarged space corresponds to a nonlinear decision boundary in the original feature space.

Loosely speaking, we are approximating the true classifier by the leading terms of its Taylor expansion.
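A minimal sketch of the idea, assuming a degree-2 polynomial expansion of two features. The helper names and the coefficients are hypothetical and chosen by hand so that the hyperplane in the expanded space is the unit circle in the original space.

```python
def expand(x1, x2):
    """Map (x1, x2) to the degree-2 features (x1, x2, x1^2, x2^2, x1*x2)."""
    return (x1, x2, x1 * x1, x2 * x2, x1 * x2)


def linear_classify(z, beta, beta0):
    """Ordinary linear rule sign(beta0 + beta . z), applied in the expanded space."""
    score = beta0 + sum(b * v for b, v in zip(beta, z))
    return 1 if score > 0 else -1


# Hand-picked hyperplane in the expanded space: x1^2 + x2^2 - 1 = 0
beta = (0.0, 0.0, 1.0, 1.0, 0.0)
beta0 = -1.0

points = [(0.0, 0.0), (0.5, 0.5), (2.0, 0.0), (-1.5, 1.5)]
labels = [linear_classify(expand(x1, x2), beta, beta0) for x1, x2 in points]
print(labels)
```

The rule is linear in the five expanded features, yet it labels points inside the circle as -1 and points outside as +1, which no straight line in the original two features could do.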

We have talked about Support Vector Classifiers. What is the SVM then?

It might be computationally expensive to work out support vector classifiers directly, especially after feature expansion. One of the tricks that makes the computation easy is to express everything in terms of inner products between observations.

We estimate the parameters $\alpha_i$, one per training observation, and it turns out most of them can be 0. The observations with nonzero $\alpha_i$ are the actual support vectors, the ones that matter most for the classifier. This subset of vectors is selected by nonzero $\hat{\alpha}_i$ and is denoted $\mathcal{S}$.
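The inner-product form can be sketched as follows. It assumes the classifier is written as $f(x) = \beta_0 + \sum_{i \in \mathcal{S}} \alpha_i \langle x, x_i \rangle$, with the sign of each label absorbed into $\alpha_i$; the training points and coefficients are made-up values for illustration, not the result of a fit.

```python
def inner(u, v):
    """Standard inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))


def support_set(alpha):
    """Indices of observations with nonzero coefficient (the support vectors)."""
    return [i for i, a in enumerate(alpha) if a != 0.0]


def decision_function(x, train_X, alpha, beta0):
    """f(x) = beta0 + sum over support vectors of alpha_i * <x, x_i>."""
    return beta0 + sum(alpha[i] * inner(x, train_X[i]) for i in support_set(alpha))


# Hypothetical training points and coefficients; most alphas are zero
train_X = [(1.0, 1.0), (3.0, 3.0), (-1.0, -1.0), (-4.0, 0.0)]
alpha = [0.5, 0.0, -0.5, 0.0]
beta0 = 0.0

print(support_set(alpha))
print(decision_function((2.0, 0.0), train_X, alpha, beta0))
```

Only the two support vectors enter the sum, so prediction needs inner products with just those points. In this toy case the same classifier could be written as $\beta \cdot x$ with $\beta = \sum_i \alpha_i x_i = (1, 1)$.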

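What is the corresponding formalism for the expanded feature space? The solution is the kernel: replacing the inner product with a kernel function gives the effect of feature expansion without computing the expanded features, and the method becomes the SVM.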

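One popular kernel is the radial kernel:

$$
K(x_i, x_{i'}) = \exp\left( -\gamma \sum_{j=1}^{p} (x_{ij} - x_{i'j})^2 \right),
$$

where $\gamma > 0$ is a tuning parameter.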
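A short sketch of the radial kernel, assuming the usual definition $K(x, z) = \exp(-\gamma \lVert x - z \rVert^2)$ with a positive tuning parameter. The function name and the default value of `gamma` are illustrative choices, not taken from any library.

```python
import math


def radial_kernel(x, z, gamma=1.0):
    """exp(-gamma * squared Euclidean distance between x and z)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


near = radial_kernel((0.0, 0.0), (0.1, 0.0))
far = radial_kernel((0.0, 0.0), (3.0, 0.0))
print(near, far)
```

The kernel equals 1 for identical points and decays toward 0 as the points move apart, so a training observation only influences predictions in its own neighbourhood. A larger `gamma` makes that neighbourhood smaller.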

How SVM compares with logistic regression (LR):

1. SVM tends to do better than LR when the classes are almost separable.
2. LR with a ridge penalty tends to do better than SVM when the classes are far from separable.

© 2018, Lei Ma