March 2024

Non-discrimination Criteria for Generative Language Models

Sara Sterlie, Nina Weng, Aasa Feragen

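TL;DR: This work studies how to uncover and quantify gender bias in generative language models. It designs criteria targeting occupational gender stereotypes, and uses test results on those stereotypes to demonstrate that occupational gender bias is present in generative AI models.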

Abstract

In recent years, generative AI, such as large language models, has undergone rapid development. As these models become increasingly available to the public, concerns arise about perpetuating and amplifying ha