Mitigating Political Bias in Language Models Through Reinforced Calibration

April 2021

Mitigating Political Bias in Language Models Through Reinforced Calibration

Ruibo Liu, Chenyan Jia, Jason Wei, Guangxuan Xu, Lili Wang...

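TL;DR: This paper describes metrics for measuring political bias in GPT-2 generation and proposes a reinforcement learning framework for mitigating political bias in generated text. In empirical experiments on three attributes, the proposed method reduces bias while maintaining readability and semantic coherence.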

Abstract

Current large-scale language models can be politically biased as a result of the data they are trained on, potentially causing serious problems when they are deployed in real-world settings. In this paper, we describe metrics for measuring →