Intersectional Bias in Hate Speech and Abusive Language Datasets

May 2020

Jae Yeon Kim, Carlos Ortiz, Sarah Nam, Sarah Santiago, Vivek Datta

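TL;DR: By classifying a Twitter dataset, this study finds that algorithmic judgments of abusive language and hate speech are more strongly biased against African Americans and African American men, providing the first systematic evidence of intersectional bias in the datasets behind these algorithms.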

Abstract

Algorithms are widely applied to detect hate speech and abusive language in social media. We investigated whether the human-annotated data