We present WinoQueer: a benchmark specifically designed to measure whether large language models (LLMs) encode biases that are harmful to the LGBTQ+ community. The benchmark is community-sourced, via application of a novel method that generates a bias benchmark from a community survey. We apply our benchmark to several popular LLMs and find that off-the-shelf models generally do exhibit considerable anti-queer bias. Finally, we show that LLM bias against a marginalized community can be somewhat mitigated by finetuning on data written about or by members of that community, and that social media text written by community members is more effective than news text written about the community by non-members. Our method for community-in-the-loop benchmark development provides a blueprint for future researchers to develop community-driven, harms-grounded LLM benchmarks for other marginalized communities.

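WinoQueer is a benchmark for measuring whether large language models encode biases that are harmful to the LGBTQ+ community; the paper derives this bias benchmark from a community survey. The benchmark is applied to several popular LLMs and shows that off-the-shelf models generally exhibit considerable anti-queer bias. Finally, we show that LLM bias against a marginalized community can be mitigated by finetuning on data written by community members, and that social media text is more effective than news text written by non-members.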

WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models