📕 The Support Vector Machine (SVM) is an algorithm used for binary classification. In its vanilla form, it is a simple yet effective algorithm that can be used when two categories are distinctly different from one another. In this tutorial, we dive deeper into SVMs to understand the intuition behind the algorithm and how it works.

1. How are SVMs used?

SVMs are primarily designed for binary classification tasks, where data can be divided into two distinct categories. The SVM algorithm accomplishes this by identifying a line (or, in more than two dimensions, a hyperplane) that separates the two categories with a clear boundary between them. SVM is simple and intuitive, and works best when the dimensionality of the data is low.

2. How do SVMs work?

You can think of the Support Vector Machine algorithm as one that crafts a boundary. Ideally, the boundary separates the data into two categories: data points on the left of the boundary and data points on the right.

In simple terms, you can think of an SVM as performing the following steps:

1) Create a vector to project the points upon

2) Project each point onto the vector to get a distance (the scalar projection)

3) Choose a distance (labeled A) to separate the categories

4) Any point whose projection results in a distance shorter than A is classified as belonging to one category, while the remaining points are classified as belonging to the other category.
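The four steps above can be sketched in a few lines of plain Python. This is only an illustration of the projection-and-threshold idea, not a full SVM: the vector `u`, the distance `A`, the sample points, and the helper names `scalar_projection` and `classify` are all hypothetical values chosen for this example.

```python
import math

def scalar_projection(point, u):
    """Signed length of `point` measured along the direction of `u`."""
    dot = sum(p * c for p, c in zip(point, u))
    norm = math.sqrt(sum(c * c for c in u))
    return dot / norm

def classify(point, u, A):
    """Return 0 if the projected distance is shorter than A, otherwise 1."""
    return 0 if scalar_projection(point, u) < A else 1

# Hypothetical values, chosen for illustration only.
u = (1.0, 1.0)   # step 1: the vector to project onto
A = 3.0          # step 3: the distance that separates the categories
points = [(1.0, 1.0), (1.5, 0.5), (3.0, 3.0), (4.0, 2.5)]

distances = [scalar_projection(p, u) for p in points]  # step 2
labels = [classify(p, u, A) for p in points]           # step 4
print(labels)  # [0, 0, 1, 1]
```

The first two points project to a distance of about 1.41 and fall in one category; the last two project to about 4.24 and 4.60 and fall in the other.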

3. How is the SVM algorithm trained?

From the steps above, you can see that the SVM algorithm depends on two things: the vector onto which the points are projected, and the distance used to separate the two categories (the boundary).
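To see why both choices matter, the sketch below scores a few candidate (vector, distance) pairs on a tiny labelled dataset. It is not how an SVM is actually trained; it only shows that changing either the vector or the distance changes the classification result. The data, the candidate values, and the helper names `predict` and `accuracy` are hypothetical.

```python
import math

def predict(point, u, A):
    """Project `point` onto `u` and compare the distance with A."""
    dot = sum(p * c for p, c in zip(point, u))
    distance = dot / math.sqrt(sum(c * c for c in u))
    return 0 if distance < A else 1

def accuracy(data, u, A):
    """Fraction of (point, label) pairs that are classified correctly."""
    correct = sum(1 for point, label in data if predict(point, u, A) == label)
    return correct / len(data)

# Hypothetical labelled data: (point, category).
data = [((1.0, 1.0), 0), ((1.5, 0.5), 0), ((3.0, 3.0), 1), ((4.0, 2.5), 1)]

good = accuracy(data, u=(1.0, 1.0), A=3.0)           # suitable vector and distance
poor_vector = accuracy(data, u=(1.0, -1.0), A=3.0)   # same distance, unsuitable vector
poor_distance = accuracy(data, u=(1.0, 1.0), A=5.0)  # same vector, unsuitable distance
print(good, poor_vector, poor_distance)  # 1.0 0.5 0.5
```

With a poorly chosen vector or a poorly chosen distance, every point ends up in the same category, so only half of this dataset is labelled correctly.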

3.1 Optimizing the Vector that the points are projected upon

In any optimization problem, it is important to understand the goal of the optimization. In this case, the vector plays a role in classification. Let us call the vector $u$.