Evaluation bias occurs during model evaluation, when the benchmark data used to compare models built for the same purpose does not represent the general population.
Example: A dataset used to benchmark sentiment analysis models for movie reviews consists mostly of positive reviews, so a model that handles negative reviews poorly can still score well on it.
How to prevent evaluation bias
Make sure that your model performs well not only on the benchmark data but also on a different set of data.
Check the distribution of your data and see if your benchmark data represents the general population.
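
Both checks above can be sketched in a few lines of Python. This is a minimal illustration, not a complete evaluation procedure: the label lists are made-up data, the "always positive" predictions stand in for a hypothetical weak model, and total variation distance is just one possible way to compare two label distributions.

```python
from collections import Counter


def label_shares(labels):
    """Return the fraction of examples carrying each label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def total_variation_distance(labels_a, labels_b):
    """Half the summed absolute difference between two label distributions.

    0.0 means the distributions are identical, 1.0 means they share no labels.
    """
    shares_a = label_shares(labels_a)
    shares_b = label_shares(labels_b)
    all_labels = set(shares_a) | set(shares_b)
    return 0.5 * sum(
        abs(shares_a.get(label, 0.0) - shares_b.get(label, 0.0))
        for label in all_labels
    )


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical data: the benchmark is 90% positive reviews, while a second
# sample assumed to be closer to the general population is balanced.
benchmark_labels = ["pos"] * 9 + ["neg"] * 1
reference_labels = ["pos"] * 5 + ["neg"] * 5

# Hypothetical weak model that predicts "pos" for every review.
benchmark_predictions = ["pos"] * len(benchmark_labels)
reference_predictions = ["pos"] * len(reference_labels)

# Check 1: does the benchmark label distribution match the reference sample?
tvd = total_variation_distance(benchmark_labels, reference_labels)

# Check 2: does the model hold up on data other than the benchmark?
benchmark_acc = accuracy(benchmark_predictions, benchmark_labels)
reference_acc = accuracy(reference_predictions, reference_labels)

print(f"Label distribution gap (TVD): {tvd:.2f}")
print(f"Accuracy on benchmark: {benchmark_acc:.2f}")
print(f"Accuracy on reference sample: {reference_acc:.2f}")
```

A large distribution gap together with a sharp drop in accuracy on the second dataset suggests that the benchmark score overstates how the model would perform on the general population.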