The growing integration of large language models (LLMs) into social
operations amplifies their impact on decisions in crucial areas such as
economics, law, education, and healthcare, raising public concerns about these
models' discrimination-related safety and reliability. However, prior
discrimination-measuring frameworks assess only the average discriminatory
behavior of LLMs, which is often inadequate because it overlooks an additional
source of discrimination: the variation in LLMs' predictions across diverse
contexts. In this work, we present the Prejudice-Caprice Framework
(PCF) that comprehensively measures discrimination in LLMs by considering both
their consistently biased preference and preference variation across diverse
contexts. Specifically, we mathematically dissect the aggregated contextualized
discrimination risk of LLMs into prejudice risk, originating from LLMs'
persistent prejudice, and caprice risk, stemming from their generation
inconsistency. In addition, we use a data-mining approach to gather
preference-detecting probes from sentence skeletons that carry no attribute
indications, approximating the contexts in which LLMs are applied. Although
initially intended for assessing discrimination in LLMs, PCF supports
comprehensive and flexible measurement of any inductive bias, including
knowledge as well as prejudice, in models of various modalities. We apply our
discrimination-measuring framework to 12 common LLMs, yielding intriguing
findings: i) modern LLMs demonstrate significant pro-male stereotypes, ii)
LLMs' exhibited discrimination correlates with several social and economic
factors, iii) prejudice risk dominates the overall discrimination risk and
follows a normal distribution, and iv) caprice risk contributes minimally to
the overall risk but follows a fat-tailed distribution, suggesting that it is
a wild risk requiring enhanced surveillance.

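By considering both the persistent prejudice and the generation inconsistency of large language models, this paper proposes the Prejudice-Caprice Framework (PCF) to comprehensively measure discriminatory behavior in LLMs. We apply our discrimination-measuring framework to 12 common LLMs and find that modern LLMs exhibit significant pro-male bias and that LLMs' discriminatory behavior correlates with several social and economic factors.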
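
The idea of splitting an aggregated, context-dependent risk into a persistent part and an inconsistency part can be illustrated with a toy calculation. The sketch below is not the PCF's actual risk definition, which the abstract does not spell out. It assumes, purely for illustration, that a model's preference in each context is a single probability, that 0.5 is the neutral value, and that risk is the squared deviation from neutral. Under those assumptions, the mean risk over contexts equals the squared bias of the mean preference (a prejudice-like term) plus the variance of the preference across contexts (a caprice-like term). The function name `decompose_risk` and the sample values are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_risk(preferences, neutral=0.5):
    """Split a toy squared-deviation risk into two additive parts.

    ASSUMPTION: this squared-error decomposition is an illustrative
    stand-in, not the risk criterion defined by the PCF.

    preferences: per-context probabilities that the model prefers one
                 attribute value (for example, "male") over the other.
    neutral:     the preference an unbiased model would show.
    """
    prefs = list(preferences)
    # Aggregated risk: average squared deviation from neutral over contexts.
    overall = fmean([(p - neutral) ** 2 for p in prefs])
    # Prejudice-like part: deviation of the *average* preference from neutral.
    prejudice = (fmean(prefs) - neutral) ** 2
    # Caprice-like part: population variance of the preference across contexts.
    caprice = pvariance(prefs)
    return overall, prejudice, caprice


if __name__ == "__main__":
    # Hypothetical preferences for one occupation across four contexts.
    overall, prejudice, caprice = decompose_risk([0.9, 0.7, 0.8, 0.6])
    print(f"overall={overall:.4f} prejudice={prejudice:.4f} caprice={caprice:.4f}")
```

In this toy setting, a model that always answers 0.75 and a model that alternates between 0.5 and 1.0 have the same average preference, but only the second incurs the variance term. An average-only metric would miss this kind of difference.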