BriefGPT.xyz
Feb, 2024
Potential and Challenges of Model Editing for Social Debiasing
Jianhao Yan, Futing Wang, Yafu Li, Yue Zhang
TL;DR
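Large language models exhibit stereotype biases, and model editing methods can mitigate this problem. This comprehensive study evaluates, from multiple perspectives, the potential and challenges of seven model editing algorithms for removing stereotype bias, and proposes two simple yet effective methods to improve stereotype-bias editing.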
Abstract
Large language models (LLMs) trained on vast corpora suffer from inevitable stereotype biases. Mitigating these biases with fine-tuning could be both costly and data-hungry.