As Natural Language Processing (NLP) technology rapidly develops and spreads into daily life, it becomes crucial to anticipate how its use could harm people. However, our ways of assessing the biases of NLP models have not kept up. Although the detection of English-language gender bias in such models in particular has received increasing research attention, many of the measures face serious problems: it is often unclear what they actually measure and how much they are subject to measurement error. In this paper, we take an interdisciplinary approach to the issue of NLP model bias by adopting the lens of psychometrics, a field specialized in the measurement of concepts like bias that are not directly observable. We pair an introduction of relevant psychometric concepts with a discussion of how they could be used to evaluate and improve bias measures. We also argue that adopting psychometric vocabulary and methodology can make NLP bias research more efficient and transparent.

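This paper surveys the problems raised by the rapid development of natural language processing technology, in particular how to detect bias in these technologies. The authors discuss psychometric concepts that can be used to evaluate and improve bias measures, and argue that adopting psychometric vocabulary and methodology can make NLP bias research more efficient and transparent.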

Undesirable Biases in NLP: Averting a Crisis of Measurement