This paper proposes two intuitive metrics, skew and stereotype, that quantify and analyse the gender bias present in contextual language models when tackling the WinoBias pronoun resolution task. We find evidence that gender stereotype is approximately negatively correlated with gender skew in out-of-the-box models, suggesting a trade-off between these two forms of bias. We investigate two methods to mitigate bias. The first is an online method that is effective at removing skew at the expense of stereotype. The second, inspired by previous work on ELMo, fine-tunes BERT on an augmented, gender-balanced dataset. We show that this reduces both skew and stereotype relative to its unaugmented fine-tuned counterpart. However, we find that existing gender bias benchmarks do not fully probe professional bias, as pronoun resolution may be obfuscated by cross-correlations from other manifestations of gender prejudice. Our code is available online at https://github.com/12kleingordon34/NLP_masters_project.

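This paper proposes two intuitive metrics, skew and stereotype, to quantify and analyse the gender bias present in contextual language models on the WinoBias pronoun resolution task, and investigates two methods for mitigating that bias. The first is an online method that effectively removes skew at the expense of stereotype. The second, drawing on previous work on ELMo, fine-tunes BERT on an augmented, gender-balanced dataset; compared with BERT fine-tuned without augmentation, it reduces both skew and stereotype. However, we find that existing gender bias benchmarks do not fully probe professional bias, because pronoun resolution may be confounded by cross-correlations from other manifestations of gender bias.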

Quantifying Gender Bias and Skew in Pre-trained Language Models