Label bias

Label bias occurs when data is labeled inconsistently. That is, annotators use different labels for the same thing.

Example: In a multilabel classification task, two annotators label the same picture in different ways. One applies the labels “a kid” and “a kid who is smiling”, while the other applies “joyful” and “a happy kid”.

Inconsistent labeling for the same image
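
One way to surface this kind of inconsistency is to compare the label sets that two annotators gave to the same image. The sketch below is illustrative only: the function names, the image ids, and the choice of Jaccard similarity as the overlap measure are assumptions, not something this page prescribes.

```python
def jaccard(a, b):
    """Jaccard similarity of two label sets (1.0 = identical, 0.0 = disjoint)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty label sets count as full agreement
    return len(a & b) / len(a | b)


def find_disagreements(annotations_a, annotations_b, threshold=1.0):
    """Return ids of images labeled by both annotators whose overlap is below `threshold`."""
    shared = annotations_a.keys() & annotations_b.keys()
    return sorted(
        image_id
        for image_id in shared
        if jaccard(annotations_a[image_id], annotations_b[image_id]) < threshold
    )


# Hypothetical annotations; img_001 mirrors the example above.
annotator_1 = {"img_001": {"a kid", "a kid who is smiling"}, "img_002": {"dog"}}
annotator_2 = {"img_001": {"joyful", "a happy kid"}, "img_002": {"dog"}}

flagged = find_disagreements(annotator_1, annotator_2)
```

Here `flagged` contains only `img_001`, because the two annotators share no label for that image. Exact string matching is deliberately strict: it treats “a happy kid” and “a kid who is smiling” as different labels, which is how a model trained on them would see them too.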

How to prevent label bias

Label bias mainly arises from inconsistent labeling. To prevent it, you can:

  • Ensure that your labels are consistent, for example by agreeing on a fixed set of labels and on when each one applies.

  • Create a gold standard for labeling your data. A gold standard is a set of data labeled in an ideal way, as close as possible to the objective truth.
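
The two steps above can be sketched together: map free-text labels onto a fixed vocabulary, then check each annotator against the gold standard. This is a minimal sketch under stated assumptions. The `CANONICAL` mapping, the canonical labels "child" and "smiling", and both function names are hypothetical, and exact-match agreement is only one possible scoring choice.

```python
# Hypothetical controlled vocabulary: free-text label -> canonical labels.
CANONICAL = {
    "a kid": {"child"},
    "a kid who is smiling": {"child", "smiling"},
    "a happy kid": {"child", "smiling"},
    "joyful": {"smiling"},
}


def normalize(labels):
    """Map free-text labels to canonical ones; reject labels outside the vocabulary."""
    canonical = set()
    for label in labels:
        key = label.strip().lower()
        if key not in CANONICAL:
            raise KeyError(f"label not in vocabulary: {label!r}")
        canonical |= CANONICAL[key]
    return canonical


def gold_agreement(annotations, gold):
    """Fraction of gold-standard items whose normalized labels match the gold labels exactly."""
    if not gold:
        raise ValueError("gold standard is empty")
    matches = sum(
        1
        for item_id, gold_labels in gold.items()
        if item_id in annotations
        and normalize(annotations[item_id]) == set(gold_labels)
    )
    return matches / len(gold)


# Hypothetical gold standard and annotations for a single image.
gold = {"img_001": {"child", "smiling"}}
annotator_1 = {"img_001": ["a kid", "a kid who is smiling"]}
annotator_2 = {"img_001": ["joyful", "a happy kid"]}

score_1 = gold_agreement(annotator_1, gold)
score_2 = gold_agreement(annotator_2, gold)
```

Once both annotators' labels are mapped onto the same vocabulary, they agree with the gold standard and with each other, even though their original wording had nothing in common. Raising an error on unknown labels is a design choice that forces new labels to be added to the vocabulary deliberately rather than slipping in unnoticed.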
