Macro-recall measures the average recall per class. It’s short for macro-averaged recall.
Macro-recall = 1 means the model’s predictions are perfect: all samples of each class were predicted as that class.
All classes treated equally
Macro-recall will be low for models that perform well only on the common classes while performing poorly on the rare classes. It is therefore a useful complement to overall accuracy.
Recall is a metric used in binary classification problems to answer the following question: What proportion of actual positives was predicted correctly?
A medical test with high Recall will identify a large proportion of the true disease cases. However, the same test might be over-predicting the positive class and give many false positive predictions!
Recall is defined as:

Recall = TP / (TP + FN)

where:

True positive (TP) is when an actual positive is predicted positive, and
False negative (FN) is when an actual positive is predicted negative.
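As a minimal sketch, recall can be computed directly from these two counts (the counts below are illustrative values, not from the text):

```python
def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN): the share of actual positives predicted positive."""
    return tp / (tp + fn)

# Illustrative counts for a binary classifier (assumed values):
print(recall(tp=80, fn=20))  # 0.8 — the model finds 80% of the actual positives
```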
Read more about this in the Confusion matrix entry in the glossary.
Note that you can always check the recall for each individual class in the Confusion matrix on the Evaluation view.
Macro-averaging is used for models with more than 2 target classes, for example, in our tutorial Self sorting wardrobe.
Macro-averaging is performed by first computing the recall of each class, and then taking the average of all recalls.
When macro-averaging, all classes contribute equally regardless of how often they appear in the dataset.
Macro-averaging is the default aggregation method for recall for single-label multi-class problems.
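The two steps above (per-class recall, then the unweighted mean) can be sketched in plain Python; the labels here are hypothetical and only serve to illustrate:

```python
from collections import defaultdict

def macro_recall(y_true, y_pred):
    """Compute recall per class, then take the unweighted mean (macro-averaging)."""
    tp = defaultdict(int)  # actual == predicted == class
    fn = defaultdict(int)  # actual == class, but predicted as something else
    for actual, predicted in zip(y_true, y_pred):
        if actual == predicted:
            tp[actual] += 1
        else:
            fn[actual] += 1
    classes = set(y_true)
    recalls = [tp[c] / (tp[c] + fn[c]) for c in classes]
    return sum(recalls) / len(recalls)

# Hypothetical labels for a 3-class problem (A, B, C):
y_true = ["A", "A", "B", "C", "C"]
y_pred = ["A", "B", "B", "C", "A"]
print(macro_recall(y_true, y_pred))  # per-class recalls A=0.5, B=1.0, C=0.5 → ≈ 0.667
```

Because each class contributes one recall value to the mean, a rare class weighs exactly as much as a common one.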
Let’s imagine you have a multi-class classification problem with 3 classes (A, B, C). The first step is to calculate how many True positives (TP) and False negatives (FN) we have for each class:
A: 2 TP and 8 FN
B: 1 TP and 1 FN
C: 1 TP and 1 FN
Then we calculate the recall for each class:
RA = 2 / (2 + 8) = 0.2
RB = 1 / (1 + 1) = 0.5
RC = 1 / (1 + 1) = 0.5
And finally we average them:
Macro-recall = (0.2 + 0.5 + 0.5) / 3 = 0.4
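The worked example can be checked in a few lines, using the per-class counts from the text:

```python
# Per-class counts from the example: class -> (TP, FN)
counts = {"A": (2, 8), "B": (1, 1), "C": (1, 1)}

# Step 1: recall per class
recalls = {cls: tp / (tp + fn) for cls, (tp, fn) in counts.items()}

# Step 2: unweighted mean over the classes
macro_avg = sum(recalls.values()) / len(recalls)

print(recalls)              # {'A': 0.2, 'B': 0.5, 'C': 0.5}
print(round(macro_avg, 3))  # 0.4
```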