The categorical error metric measures how often the model’s prediction is wrong. Since it should decrease toward zero during training, it is convenient to plot on a logarithmic scale, which makes small improvements near zero easier to see.
In a multiclass classification problem, we consider a prediction to be wrong when the class with the highest score doesn’t match the class of the target feature.
A categorical error of 0 means that all of the model’s predictions are correct.
The formula for categorical error is:
Categorical error = (number of wrong predictions) / (total number of predictions)
It is the complement of categorical accuracy: categorical error = 1 - categorical accuracy.
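As an illustration of the definition above, here is a minimal Python sketch that counts a prediction as wrong when the class with the highest score differs from the target class. The function name categorical_error, the example scores, and the rule that the first class wins a tie are assumptions made for this example; they are not taken from any particular library.

```python
from typing import Sequence


def categorical_error(scores: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Fraction of examples whose highest-scoring class differs from the target class."""
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not scores:
        raise ValueError("at least one example is required")
    wrong = 0
    for row, target in zip(scores, targets):
        # Predicted class = index of the highest score (first index wins on ties).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted != target:
            wrong += 1
    return wrong / len(targets)


# Hypothetical scores for 4 examples over 3 classes.
scores = [
    [0.7, 0.2, 0.1],  # predicted class 0
    [0.1, 0.8, 0.1],  # predicted class 1
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.6, 0.3, 0.1],  # predicted class 0
]
targets = [0, 1, 1, 0]

error = categorical_error(scores, targets)
accuracy = 1.0 - error  # categorical accuracy is the complement
print(f"categorical error = {error:.2f}, categorical accuracy = {accuracy:.2f}")
```

In this example one prediction out of four is wrong (the third), so the categorical error is 0.25 and the categorical accuracy is 0.75.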
Suggestions on how to improve
If there is a large discrepancy between the training and validation categorical error, with the validation error being much higher (a sign of overfitting), try adding dropout and/or batch normalization blocks to improve generalization. Overfitting means that the model performs well on the training examples (resulting in a low training loss), but badly on new examples it hasn’t seen before (resulting in a high validation loss).
A large discrepancy can also show that the validation data are too different from the training data.
If the training categorical error is high, the model is not learning well enough. Try to build a new model or collect more training data.
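The suggestions above can be sketched as a simple rule of thumb in Python. The function name diagnose and both thresholds (0.30 for a "high" training error, 0.10 for a "large" gap) are hypothetical values chosen only for illustration; sensible values depend on the problem and the number of classes.

```python
def diagnose(train_error: float, val_error: float,
             high_error_threshold: float = 0.30,
             gap_threshold: float = 0.10) -> str:
    """Return a rough hint based on training and validation categorical error.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    if train_error > high_error_threshold:
        # The model does not even fit the training data well.
        return "high training error: try a new model or collect more training data"
    if val_error - train_error > gap_threshold:
        # Good on training data, much worse on unseen data.
        return ("large gap: possible overfitting (try dropout or batch normalization) "
                "or validation data too different from training data")
    return "no obvious problem"


print(diagnose(train_error=0.02, val_error=0.25))
print(diagnose(train_error=0.45, val_error=0.47))
```

The check for a high training error comes first because a model that has not learned the training data cannot meaningfully be said to overfit it.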