Mean absolute error

Mean absolute error (MAE) is a loss function used for regression. The loss is the mean, over the seen data, of the absolute differences between the true and predicted values. Written as a formula:

\[L(y, \hat{y}) = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i|\]

where y_i is the true value and ŷ_i is the predicted value for example i, and N is the number of examples.
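As a minimal sketch of the formula, the function below computes MAE for two equal-length lists in plain Python. The function name `mean_absolute_error` and the sample values are made up for illustration and do not come from any particular library.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |y_i - y_hat_i| over all N examples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one example")
    total = sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred))
    return total / len(y_true)


# Made-up values, for illustration only.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mae = mean_absolute_error(y_true, y_pred)
print(mae)  # (0.5 + 0.5 + 0.0 + 1.0) / 4 = 0.5
```

Every example contributes its absolute error with equal weight, so a single large error raises the loss linearly rather than quadratically.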

Why use mean absolute error

MAE is less sensitive to outliers than mean squared error (MSE). Given several examples with the same input feature values, the prediction that minimizes MAE is the median of their target values, whereas the prediction that minimizes MSE is their mean. A disadvantage of MAE is that the gradient magnitude does not depend on the size of the error, only on the sign of y - ŷ. The gradient is therefore just as large when the error is small as when it is large, which can lead to convergence problems.
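Both claims can be checked numerically. The sketch below assumes a hypothetical set of targets that share the same input features, one of which is an outlier, and searches a grid of constant predictions for the minimizer of each loss. The helper names (`mae`, `mse`, `mae_grad`, `mse_grad`) are illustrative only.

```python
from statistics import mean, median


def mae(targets, prediction):
    return sum(abs(y - prediction) for y in targets) / len(targets)


def mse(targets, prediction):
    return sum((y - prediction) ** 2 for y in targets) / len(targets)


def mae_grad(y, y_hat):
    """(Sub)gradient of |y - y_hat| with respect to y_hat: -sign(y - y_hat)."""
    diff = y - y_hat
    if diff == 0:
        return 0.0
    return -1.0 if diff > 0 else 1.0


def mse_grad(y, y_hat):
    """Gradient of (y - y_hat)**2 with respect to y_hat."""
    return -2.0 * (y - y_hat)


# Hypothetical targets for identical inputs; 100.0 plays the outlier.
targets = [1.0, 2.0, 3.0, 4.0, 100.0]

# Grid of candidate constant predictions: 0.0, 0.1, ..., 100.0
candidates = [i / 10 for i in range(0, 1001)]
best_mae = min(candidates, key=lambda c: mae(targets, c))
best_mse = min(candidates, key=lambda c: mse(targets, c))

print(best_mae, median(targets))  # 3.0 3.0
print(best_mse, mean(targets))    # 22.0 22.0

# The MAE gradient has the same magnitude for a tiny and a huge error,
# while the MSE gradient scales with the error.
print(mae_grad(1.0, 1.001), mae_grad(1.0, 50.0))  # 1.0 1.0
print(mse_grad(1.0, 1.001), mse_grad(1.0, 50.0))
```

The outlier drags the MSE-optimal prediction up to 22, while the MAE-optimal prediction stays at the median. The constant gradient magnitude is why training with MAE is often paired with a decaying learning rate, though that choice is an assumption about the training setup rather than something stated above.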

When to use mean absolute error

Use mean absolute error when you are doing regression and don’t want outliers to play a big role. It can also be useful if you know that your target distribution is multimodal and it’s desirable to have predictions at one of the modes rather than at their mean.

Example: When doing image reconstruction, MAE encourages less blurry images than MSE. This is used, for example, in the paper Image-to-Image Translation with Conditional Adversarial Networks by Isola et al.
