Reducing Bias in Toxic Language Detection through Invariant Rationalization

June 2021

Mitigating Biases in Toxic Language Detection through Invariant Rationalization

Yung-Sung Chuang, Mingye Gao, Hongyin Luo, James Glass, Hung-yi Lee...

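TL;DR: By using the Invariant Rationalization (InvRat) method, we can reduce misclassifications triggered by certain syntactic patterns. This keeps toxicity filters from inheriting the biases of their training datasets, which would otherwise further marginalize the affected groups.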

Abstract

Automatic detection of toxic language plays an essential role in protecting social media users, especially minority groups, from verbal ab