Historical bias

Historical bias occurs when the historical data used to train an AI model is itself biased. It can occur:

  • When the data already contains bias, such as human prejudice or discrimination

  • When the data no longer reflects reality

  • When the data is incomplete or incorrect

Example: If most of the candidates a company hired in the past graduated from a certain university, an algorithm trained on that hiring data will be biased towards candidates from the same university.

Figure: Hired staff with more Harvard graduates
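The hiring example can be illustrated with a toy sketch. The records, the university names other than Harvard, and the helper names `fit_hire_rates` and `score` are all hypothetical, and the "model" here is only a per-university hire rate rather than a real learning algorithm. It is meant to show how a model that learns from past decisions simply reproduces them.

```python
from collections import Counter

# Hypothetical historical hiring records: (university, was_hired).
history = [
    ("Harvard", True), ("Harvard", True), ("Harvard", True), ("Harvard", False),
    ("State U", True), ("State U", False), ("State U", False), ("State U", False),
]

def fit_hire_rates(records):
    """Learn the historical hire rate for each university."""
    total = Counter(university for university, _ in records)
    hired = Counter(university for university, was_hired in records if was_hired)
    return {university: hired[university] / total[university] for university in total}

def score(model, university):
    """Score a candidate using nothing but the learned hire rate."""
    return model.get(university, 0.0)

model = fit_hire_rates(history)
print(score(model, "Harvard"))  # 0.75
print(score(model, "State U"))  # 0.25
```

Two otherwise identical candidates receive different scores only because of where past hires studied, which is the pattern described in the example above.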

How to prevent historical bias

Historical bias can occur even when the data is perfectly sampled and collected. Proper random sampling and careful collection do not remove it, because historical bias arises from the data itself, not from how the data was gathered.

To prevent historical bias, you can:

  • Audit the data you collected on a regular basis.

  • Ensure that you have enough domain knowledge or that someone with enough domain knowledge audits your data.
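One way to make a regular audit concrete is to compare outcome rates across groups in the collected data. The following is a minimal sketch, not a complete audit: the function names `selection_rates` and `audit`, the sample records, and the 0.8 threshold are assumptions chosen for illustration (0.8 echoes the "four-fifths" heuristic sometimes used in hiring analysis). Deciding which groups to compare and how to interpret a flagged result still requires domain knowledge.

```python
from collections import defaultdict

def selection_rates(records):
    """Return the share of positive outcomes for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}

def audit(records, min_ratio=0.8):
    """Flag groups whose rate is below min_ratio times the highest group rate."""
    rates = selection_rates(records)
    best = max(rates.values())
    if best == 0:
        return {}  # no positive outcomes at all, so there is nothing to compare
    return {g: r / best for g, r in rates.items() if r / best < min_ratio}

# Hypothetical data: group A is selected 6 times out of 10, group B 3 out of 10.
records = (
    [("A", True)] * 6 + [("A", False)] * 4
    + [("B", True)] * 3 + [("B", False)] * 7
)
flagged = audit(records)
print(flagged)
```

A flagged group is a prompt for review rather than proof of bias; running the same check each time new data is collected turns the audit into a routine step.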
