Assessing the Reliability of Word Embedding Gender Bias Measures

Sep, 2021

Yupei Du, Qixiang Fang, Dong Nguyen

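TL;DR: This study assesses three types of reliability for word embedding gender bias measures: test-retest reliability, inter-rater consistency, and internal consistency. It also examines how factors such as random seeds, scoring rules, and word selection affect reliability. The results can inform the design of better gender bias measures, and the authors advise researchers to be more critical when applying these measures.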
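To make "internal consistency" concrete, the following is a minimal sketch of Cronbach's alpha, a widely used internal-consistency statistic. This summary does not say which statistic the paper uses, so treat the choice as an assumption. The function name `cronbach_alpha`, the framing of each column as the bias score obtained from one gender word pair, and the toy numbers are all hypothetical and for illustration only.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of rows. Each row is one target word; each column is one
    "item" (hypothetically, the bias score from one gender word pair).
    """
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("need at least two items (columns)")
    # Sample variance of each item (column) across target words.
    item_vars = [variance(col) for col in zip(*scores)]
    # Sample variance of the per-word total score (row sums).
    total_var = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Toy, made-up bias scores: 4 target words x 3 gender word pairs.
toy_scores = [
    [0.10, 0.12, 0.08],
    [0.30, 0.25, 0.33],
    [-0.20, -0.15, -0.22],
    [0.05, 0.02, 0.07],
]
print(round(cronbach_alpha(toy_scores), 3))
```

Values close to 1 indicate that the items rank the target words similarly; low values suggest that the score depends heavily on which word pairs are chosen.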

Abstract

Various measures have been proposed to quantify human-like social biases in word embeddings. However, bias scores based on these measures can suffer from measurement error. One indication of …