Identifying and Reducing Gender Bias in Word-Level Language Models

April 2019

Shikha Bordia, Samuel R. Bowman

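TL;DR: Using gender as an example, this study proposes a metric to characterize socially problematic bias in text corpora. It introduces a regularization loss term for language models to reduce gender bias, and validates the effectiveness of the method on multiple corpora.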

Abstract

Many text corpora exhibit socially problematic biases, which can be propagated or amplified in models trained on such data. For example, "doctor" co-occurs more frequently with male pronouns than with female pronouns. In this study we (i) propose a metric to measure gender →