Deep learning-based methods have pushed the limits of the state-of-the-art in face analysis. However, despite their success, these models have raised concerns regarding their bias towards certain demographics. This bias stems both from limited demographic diversity in the training set and from the design of the algorithms. In this work, we investigate the demographic bias of deep learning models in face recognition, age estimation, gender recognition and kinship verification. To this end, we introduce the most comprehensive, large-scale dataset of facial images and videos to date. It consists of 40K still images and 44K sequences (14.5M video frames in total) captured in unconstrained, real-world conditions from 1,045 subjects. The data are manually annotated with identity, exact age, gender and kinship. We scrutinize the performance of state-of-the-art models and expose their demographic bias through a series of experiments. Lastly, we introduce a method to debias network embeddings and test it on the proposed benchmarks.

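This paper studies demographic bias in deep learning-based models for face recognition, age estimation, gender recognition and kinship verification. By introducing the largest and most comprehensive manually annotated dataset of facial images and videos, it exposes the performance and bias of state-of-the-art models, and finally introduces and validates a method for debiasing network embeddings.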

Investigating Bias in Deep Face Analysis: The KANFace Dataset and Empirical Study