The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks

October 2022

The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks

Nikil Roashan Selvam, Sunipa Dev, Daniel Khashabi, Tushar Khot, Kai-Wei Chang

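TL;DR: This paper studies how reliable social bias scores are and finds that non-social biases introduced during dataset construction strongly affect the measured social bias, so more robust social bias measures are needed.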

Abstract

How reliably can we trust the scores obtained from social bias benchmarks as faithful indicators of problematic social biases in a given language model? In this work, we study this question by contrasting