Binary classification is a type of supervised learning that involves prediction over two classes. Examples of binary classification problems include determining whether an email is spam or "ham" (non-spam), or whether or not a customer is likely to convert based on how they have engaged with promotional material.
Support vector machines, or SVMs for short, are a common binary classification technique known for their versatility and robustness. In a nutshell, an SVM constructs the decision boundary that creates maximum separation between the two classes: the one with the largest margin, that is, the greatest distance to the nearest points of either class. Those nearest points are the support vectors that give the method its name.
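To make the idea of a margin concrete, the following minimal sketch compares two hand-picked linear boundaries on a made-up toy dataset and reports which one leaves more room to the nearest points. The data, the candidate boundaries, and the helper name `geometric_margin` are all illustrative assumptions; a real SVM searches over all possible boundaries by solving an optimisation problem rather than comparing a fixed list.

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.

    Labels are +1 / -1. A positive result means every point lies on the
    correct side of the boundary.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )


# Toy, linearly separable data (invented for illustration).
points = [(1.0, 1.0), (2.0, 0.0), (4.0, 4.0), (5.0, 3.0)]
labels = [-1, -1, 1, 1]

# Two candidate boundaries, each given as (w, b); both separate the classes.
candidates = {
    "x1 + x2 = 5": ((1.0, 1.0), -5.0),
    "x1 = 3": ((1.0, 0.0), -3.0),
}

margins = {
    name: geometric_margin(w, b, points, labels)
    for name, (w, b) in candidates.items()
}

# Of these two candidates, the preferred boundary is the one with the larger margin.
best = max(margins, key=margins.get)
print(margins, best)
```

Both boundaries classify every toy point correctly, so accuracy alone cannot tell them apart; the margin is what distinguishes them.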
SVM models tend to generalise well and perform well on high-dimensional data, and linear SVMs are comparatively easy to interpret because each learned weight corresponds to one input feature. On the other hand, due to their non-probabilistic nature, ordinary SVMs do not directly output class probabilities: they return a signed distance to the decision boundary, which needs an extra calibration step (such as Platt scaling) before it can be read as a probability.
As with all binary classifiers, the performance of SVMs can be quantified using a variety of metrics such as precision, recall (sensitivity), and specificity (the true negative rate, also called selectivity). Dimensionality reduction is often employed to improve the accuracy and reduce the computational cost of predictive models, including SVMs.
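As a rough illustration of how these metrics are derived, the sketch below counts true/false positives and negatives and computes precision, recall, and specificity from them. The function name `binary_metrics` and the label lists are hypothetical examples, and returning 0.0 for an undefined ratio is just one possible convention.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall and specificity from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        # Of everything predicted positive, how much really was positive?
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of everything really positive, how much was found?
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Of everything really negative, how much was correctly rejected?
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Made-up labels: 1 = spam, 0 = ham.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```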
SVMs are available in a number of well-known machine learning libraries, such as Python's scikit-learn. Internally, scikit-learn relies on libsvm and liblinear, which are implemented in C++, to handle the core computations efficiently.
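A minimal sketch of training SVMs with scikit-learn might look like the following. The synthetic dataset, the split sizes, and the hyperparameters (`C=1.0`, an RBF kernel) are arbitrary choices for illustration, not recommendations. In scikit-learn, `SVC` is built on libsvm and `LinearSVC` on liblinear.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC

# Synthetic two-class data, generated purely for demonstration.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# SVMs are sensitive to feature scale, so standardise before fitting.
kernel_svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
linear_svm = make_pipeline(StandardScaler(), LinearSVC(C=1.0))

kernel_svm.fit(X_train, y_train)
linear_svm.fit(X_train, y_train)

# decision_function returns signed distances to the boundary,
# not probabilities.
scores = kernel_svm.decision_function(X_test)
preds = kernel_svm.predict(X_test)

print("kernel SVM accuracy:", kernel_svm.score(X_test, y_test))
print("linear SVM accuracy:", linear_svm.score(X_test, y_test))
```

The sign of each value in `scores` indicates the predicted side of the boundary and its magnitude indicates how far the sample lies from it; neither should be read as a probability without calibration.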