BriefGPT.xyz
Jul, 2023
Evaluating Biased Attitude Associations of Language Models in an Intersectional Context
Shiva Omrani Sabbaghi, Robert Wolfe, Aylin Caliskan
TL;DR
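Using a concept projection method based on contextualized word embeddings, the authors quantify the valence associations of social groups in English language models. They find that the models exhibit the most biased attitudes toward signals of gender identity, social class, and sexual orientation. The method is intended to support the study of historical biases in language models and to contribute to design justice by examining the associations of groups that are marginalized in language.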
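To make the idea of projecting embeddings onto a valence concept more concrete, here is a minimal sketch. It is not the authors' procedure: it assumes, purely for illustration, that a valence direction can be approximated by the difference between the mean embedding of pleasant words and the mean embedding of unpleasant words, and it uses tiny hand-written 3-dimensional vectors in place of real contextualized embeddings. All function names and values are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def valence_direction(pleasant, unpleasant):
    """Unit vector pointing from the unpleasant mean to the pleasant mean.

    Assumption: a simple difference of means stands in for whatever
    valence subspace a real study would learn from data.
    """
    diff = [p - u for p, u in zip(mean_vector(pleasant), mean_vector(unpleasant))]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("pleasant and unpleasant means coincide")
    return [d / norm for d in diff]


def valence_score(embedding, direction):
    """Scalar projection of an embedding onto the valence direction."""
    return sum(e * d for e, d in zip(embedding, direction))


# Toy stand-ins for contextualized embeddings (hypothetical values).
pleasant = [[1.0, 0.2, 0.0], [0.8, 0.0, 0.1]]
unpleasant = [[-1.0, 0.1, 0.0], [-0.9, 0.0, 0.2]]
direction = valence_direction(pleasant, unpleasant)

group_a = [0.5, 0.3, 0.1]   # hypothetical embedding of one group signal
group_b = [-0.4, 0.3, 0.1]  # hypothetical embedding of another group signal

score_a = valence_score(group_a, direction)
score_b = valence_score(group_b, direction)
print(score_a > score_b)
```

In this toy setup a higher score means the embedding lies closer to the pleasant end of the assumed direction; comparing scores across group signals is the kind of measurement the summary describes, under the stated simplifications.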
Abstract
Language models are trained on large-scale corpora that embed implicit biases documented in psychology. Valence associations (pleasantness/unpleasantness) of social groups determine the …