Among all separating lines, the SVM picks the one with the largest cushion on either side.
From Chapter 7: Supervised Learning
Glossary: support vector machine
Transcript
Two clouds of labelled points, red and blue, linearly separable.
Many lines separate them. A hundred lines, a thousand. Which one is best?
The support vector machine answers: the one with the largest margin.
Draw a line. Push two parallel lines outward, one for each class, until they touch the nearest training point. The distance between them is the margin.
Among all separating lines, the SVM picks the one whose margin is widest.
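In symbols, a minimal sketch of the standard hard-margin formulation (the transcript stays geometric; the notation below is the conventional one, not taken from the transcript): for training points x_i with labels y_i in {-1, +1},

```latex
\min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad y_i\,(w^{\top} x_i + b) \ge 1,
\qquad i = 1, \dots, n
```

The two parallel boundary lines sit at w^T x + b = +1 and w^T x + b = -1, so the margin width is 2/||w||. Minimizing ||w|| is therefore the same as maximizing the margin.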
The points sitting on the boundary lines are called support vectors. They alone determine the decision boundary. Move any other point a little and the answer does not change. Move a support vector and the whole boundary shifts.
This is why the SVM is sparse: the solution depends only on a small number of training points.
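A minimal sketch of this sparsity, using scikit-learn (an assumed library; the transcript names no implementation). A very large C approximates the hard margin, and only a handful of the training points end up as support vectors:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two linearly separable clouds of labelled points.
X, y = make_blobs(n_samples=100, centers=2, cluster_std=1.0, random_state=0)

# Very large C approximates a hard margin (no violations tolerated).
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

print("training points:", len(X))
print("support vectors:", len(clf.support_vectors_))  # typically just a few
print("margin width:", 2 / np.linalg.norm(clf.coef_))  # width = 2 / ||w||
```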
When the data are not linearly separable, allow some points to violate the margin, with a penalty. This is the soft-margin formulation, controlled by a penalty parameter C: a small C tolerates violations in exchange for a wider margin, while a large C punishes them heavily.
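A sketch of the effect of C, again assuming scikit-learn: overlapping clouds swept over a few values of C. Small C tolerates many margin violations and recruits many support vectors; large C tolerates few:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Overlapping clouds: no line separates them perfectly.
X, y = make_blobs(n_samples=100, centers=2, cluster_std=3.0, random_state=0)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # Fewer support vectors as the penalty for violations grows.
    print(f"C={C:>6}: {len(clf.support_vectors_)} support vectors")
```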
Push the data through a kernel and the same maximum-margin idea works in feature spaces that would be intractable to compute in directly.
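A sketch of the kernel trick, with scikit-learn assumed as before: concentric circles cannot be split by any line in the plane, but an RBF kernel lets the same maximum-margin machinery separate them.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# One class inside the other: no separating line exists in the plane.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)

print("linear accuracy:", linear.score(X, y))  # near chance
print("rbf accuracy:   ", rbf.score(X, y))     # near perfect
```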
Maximum margin is the geometric heart of support vector machines.