Bias in data can originate in the data itself or in the way we collect and process it.
If the data we collect does not represent real-world examples well, or if we label it inconsistently, the resulting dataset will be biased.
AI algorithms learn from data, so bias in a dataset can carry over into the models trained on it. Whatever the bias type, it is important to be aware of potential biases in your data and to plan how to prevent them.
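As a rough illustration of the two sources named above, the sketch below measures how far a dataset's group shares drift from a reference population and how often annotators disagree on a label. The function names, the "urban"/"rural" groups, and the reference shares are all hypothetical, chosen only for the example; real checks depend on your data and domain.

```python
from collections import Counter


def representation_gap(samples, reference_shares):
    """Return (dataset share - reference share) for every group.

    `samples` is a list of group values, one per record.
    `reference_shares` maps each group to its assumed real-world share.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    counts = Counter(samples)
    total = len(samples)
    groups = set(counts) | set(reference_shares)
    return {
        group: counts.get(group, 0) / total - reference_shares.get(group, 0.0)
        for group in groups
    }


def conflicting_label_rate(labels_by_item):
    """Return the share of items whose annotators did not all agree."""
    if not labels_by_item:
        raise ValueError("labels_by_item must not be empty")
    conflicted = sum(
        1 for labels in labels_by_item.values() if len(set(labels)) > 1
    )
    return conflicted / len(labels_by_item)


# Hypothetical data: the groups and reference shares are made up.
collected = ["urban"] * 80 + ["rural"] * 20
reference = {"urban": 0.5, "rural": 0.5}
gaps = representation_gap(collected, reference)

# Hypothetical annotations: two annotators per item.
annotations = {"img1": ["cat", "cat"], "img2": ["cat", "dog"]}
conflict_rate = conflicting_label_rate(annotations)
```

A positive gap means a group is over-represented relative to the reference and a negative gap means it is under-represented. A high conflict rate suggests the labeling guidelines may need tightening.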
General tips on how to prevent data bias
Start by learning the common bias types.
Understand the outliers in your data and how to handle them.
Make sure you have domain knowledge of the area your data covers, or have someone with strong domain knowledge audit your data.
Audit your data on a regular basis to detect changes over time.
Create datasheets that document how you created your data and what its characteristics are.
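Two of the tips above, understanding outliers and auditing data over time, can be sketched with the Python standard library. This is a minimal sketch under stated assumptions, not a definitive method: the 1.5 x IQR rule is one common outlier heuristic among many, total variation distance is one possible measure of distribution change, and the function names and sample values are hypothetical.

```python
from collections import Counter
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR].

    Uses the common 1.5 x IQR heuristic by default; k is an assumption
    and should be tuned with domain knowledge.
    """
    q1, _, q3 = quantiles(values, n=4)  # needs at least two values
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def label_shift(old_labels, new_labels):
    """Return the total variation distance between two label distributions.

    0.0 means identical label shares; 1.0 means no overlap at all.
    """
    if not old_labels or not new_labels:
        raise ValueError("both snapshots must contain labels")
    old_counts, new_counts = Counter(old_labels), Counter(new_labels)
    labels = set(old_counts) | set(new_counts)
    return 0.5 * sum(
        abs(old_counts.get(l, 0) / len(old_labels)
            - new_counts.get(l, 0) / len(new_labels))
        for l in labels
    )


# Hypothetical measurements with one suspicious value.
flagged = iqr_outliers([1, 2, 3, 4, 5, 100])

# Hypothetical label snapshots taken at two audit dates.
january = ["spam"] * 50 + ["ham"] * 50
june = ["spam"] * 70 + ["ham"] * 30
shift = label_shift(january, june)
```

Flagged values are candidates for review rather than automatic removal, since an outlier may be a data-entry error or a genuine rare case. Numbers like these are also natural entries for a datasheet, next to a description of how the data was collected.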