BriefGPT.xyz
Oct, 2024
Bias Amplification: Language Models as Increasingly Biased Media
Ze Wang, Zekun Wu, Jeremy Zhang, Navya Jain, Xin Guan...
TL;DR
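This study examines the bias amplification that arises when large language models (LLMs) are trained on synthetic data. We propose a theoretical framework that specifies the conditions under which bias amplification occurs, verify experimentally that GPT-2 becomes progressively more biased when trained on synthetic data, and explore effective mitigation strategies. We find that bias and model collapse are driven by different neurons, which offers a new way of understanding model fairness.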
Abstract
As Large Language Models (LLMs) become increasingly integrated into various facets of society, a significant portion of online text consequently becomes synthetic. This raises concerns about Bias Amplification, a...