The squared hinge loss is a loss function used for “maximum margin” binary classification problems. Mathematically it is defined as:

ℓ(y, ŷ) = max(0, 1 − y·ŷ)²

where ŷ is the predicted value and y is either −1 or 1. Thus, the squared hinge loss is:
0
* when the true and predicted labels are the same and
* when y·ŷ ≥ 1 (which indicates that the classifier is confident it’s the correct label)

quadratically increasing with the error
* when the true and predicted labels are not the same or
* when y·ŷ < 1, even when the true and predicted labels are the same (which indicates that the classifier is not confident it’s the correct label)
Note

ŷ should be the actual numerical output of the classifier and not the predicted label.
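The definition above can be sketched directly in code. This is a minimal NumPy example; the function name `squared_hinge` and the sample values are illustrative, not part of any particular library’s API:

```python
import numpy as np

def squared_hinge(y_true, y_pred):
    """Squared hinge loss: max(0, 1 - y*ŷ)², with y in {-1, +1}
    and ŷ the raw classifier output (not the predicted label)."""
    return np.maximum(0.0, 1.0 - y_true * y_pred) ** 2

# Correct and confident (y·ŷ ≥ 1): zero loss
loss_confident = squared_hinge(1.0, 2.5)   # 0.0
# Correct label but inside the margin (0 < y·ŷ < 1): small penalty
loss_margin = squared_hinge(1.0, 0.5)      # 0.25
# Wrong side of the boundary: penalty grows quadratically with the error
loss_wrong = squared_hinge(-1.0, 2.0)      # 9.0
```

Note that passing the raw output (e.g. 2.5) rather than the thresholded label (1) is what lets the loss distinguish confident predictions from marginal ones.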

During training, the hinge loss pushes the classifier toward the classification boundary that is as far as possible from the data points of each class. In other words, it favors the boundary with the maximum margin between the data points of the different classes.
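To illustrate the margin-maximizing behavior, here is a minimal sketch of training a linear classifier ŷ = w·x + b with the squared hinge loss by plain gradient descent. The toy data, learning rate, and iteration count are all assumptions made for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated toy blobs (hypothetical data): class -1 near (-2, -2),
# class +1 near (2, 2)
X = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

w, b = np.zeros(2), 0.0
lr = 0.01
for _ in range(200):
    f = X @ w + b                          # raw classifier outputs ŷ
    slack = np.maximum(0.0, 1.0 - y * f)   # margin violations
    # d/df of max(0, 1 - y·f)² is -2·y·max(0, 1 - y·f)
    grad_f = -2.0 * y * slack
    w -= lr * (X.T @ grad_f) / len(y)
    b -= lr * grad_f.mean()

accuracy = np.mean(np.sign(X @ w + b) == y)
```

Points already classified with margin y·ŷ ≥ 1 contribute zero gradient, so only the points near or on the wrong side of the boundary keep pushing it, which is what drives the boundary away from both classes.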