Disparities in face recognition accuracy across demographic groups have attracted increasing attention in recent years. Various face image datasets have been proposed as 'fair' or 'balanced' for assessing the accuracy of face recognition algorithms across demographics. These datasets often balance the number of identities and images across demographic groups, but the numbers of identities and images in an evaluation dataset are not the driving factors for 1-to-1 face matching accuracy. Moreover, balancing the number of identities and images does not ensure balance in other factors known to impact accuracy, such as head pose, brightness, and image quality. We demonstrate these issues using several recently proposed datasets. To enable less biased evaluations, we propose a bias-aware toolkit that facilitates the creation of cross-demographic evaluation datasets balanced on the factors discussed in this paper.

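This paper addresses disparities in face recognition accuracy. It points out that although many datasets balance the number of identities and images across demographic groups, these counts are not the decisive factors for 1-to-1 face matching accuracy, so a more bias-aware toolkit is needed to create balanced cross-demographic evaluation datasets.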
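
To make the distinction between balancing counts and balancing factors concrete, the following is a minimal sketch of one way to equalize a single factor (here, brightness) across demographic groups by binning and subsampling. It is not the toolkit proposed in the paper: the function name `balance_on_factor`, the record fields `group` and `brightness`, and the bin width are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def balance_on_factor(records, factor, bin_width, seed=0):
    """Subsample records so every group has the same count in each factor bin.

    Hypothetical helper for illustration only. Each record is a dict with a
    'group' key and a numeric value under the key named by `factor`.
    """
    rng = random.Random(seed)
    groups = sorted({r["group"] for r in records})

    # Map: bin index -> group -> records whose factor value falls in that bin.
    bins = defaultdict(lambda: defaultdict(list))
    for r in records:
        bins[int(r[factor] // bin_width)][r["group"]].append(r)

    balanced = []
    for by_group in bins.values():
        # Keep only as many records per group as the smallest group has in
        # this bin; a bin that some group lacks entirely contributes nothing.
        n = min(len(by_group.get(g, [])) for g in groups)
        for g in groups:
            balanced.extend(rng.sample(by_group.get(g, []), n))
    return balanced


# Two groups with equal image counts (5 each) but different brightness profiles.
records = (
    [{"group": "A", "brightness": v} for v in (10, 20, 30, 110, 120)]
    + [{"group": "B", "brightness": v} for v in (15, 115, 125, 135, 145)]
)
balanced = balance_on_factor(records, "brightness", bin_width=100)
print(len(balanced))  # 6
```

In this toy input both groups have five images, yet group A is mostly dark and group B mostly bright. After subsampling, each group contributes one image to the dark bin and two to the bright bin, so the brightness distributions match at the cost of discarding four images.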

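Does understanding bias require truly balanced datasets? The factors that affect accuracy are not the numbers of identities and images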