Language Models (LMs) have been shown to inherit undesired biases that might
hurt minorities and underrepresented groups if such systems were integrated
into real-world applications without careful fairness auditing. This paper
proposes FairBelief, an analytical approach to capture and assess beliefs,
i.e., propositions that an LM may embed with different degrees of confidence
and that covertly influence its predictions. With FairBelief, we leverage
prompting to study the behavior of several state-of-the-art LMs across
different previously neglected axes, such as model scale and likelihood,
assessing predictions on a fairness dataset specifically designed to quantify
the hurtfulness of LM outputs. Finally, we conclude with an in-depth qualitative
assessment of the beliefs emitted by the models. We apply FairBelief to English
LMs, revealing that, although these architectures enable high performance on
diverse natural language processing tasks, they show hurtful beliefs about
specific genders. Interestingly, training procedure and dataset, model scale,
and architecture induce beliefs of different degrees of hurtfulness.
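
As a rough illustration of how hurtfulness might be tracked along the likelihood axis, the sketch below computes the share of a model's top-k completions that fall in a lexicon of hurtful words. It is a minimal sketch under stated assumptions: the function name `hurtful_share_at_k`, the toy completions, and the toy lexicon are all hypothetical and are not taken from the paper, which may score hurtfulness differently.

```python
# Hypothetical sketch: share of top-k completions found in a hurtful lexicon.
# The data below is invented for illustration; it is not model output.

def hurtful_share_at_k(ranked_completions, hurtful_words, k):
    """Return the fraction of the k most likely completions that are hurtful.

    ranked_completions: list of (word, probability), sorted by descending
    probability. hurtful_words: set of lowercase lexicon entries (assumed).
    """
    top_k = ranked_completions[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for word, _prob in top_k if word.lower() in hurtful_words)
    return hits / len(top_k)


# Toy completions for one prompt template, ranked by likelihood.
toy_completions = [
    ("teacher", 0.30),
    ("nuisance", 0.20),
    ("doctor", 0.15),
    ("burden", 0.10),
]
toy_lexicon = {"nuisance", "burden"}

# Sweep k to see how the hurtful share changes with likelihood rank.
shares = {k: hurtful_share_at_k(toy_completions, toy_lexicon, k) for k in (1, 2, 4)}
```

Sweeping k in this way separates hurtful completions the model ranks highly from those that only appear further down the ranking.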

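Using the FairBelief analytical approach, we reveal that English language models commonly hold hurtful beliefs about specific genders, and that different training procedures, datasets, model scales, and architectures induce beliefs of varying degrees of hurtfulness.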

FairBelief Evaluation: Assessing Harmful Beliefs in Language Models

FairBelief - Assessing Harmful Beliefs in Language Models

As artificial intelligence (AI) increasingly becomes an integral part of our
societal and individual activities, there is a growing imperative to develop
responsible AI solutions. Although a diverse assortment of machine learning
fairness solutions has been proposed in the literature, there is reportedly a lack of
practical implementation of these tools in real-world applications. Industry
experts have participated in thorough discussions on the challenges associated
with operationalising fairness in the development of machine learning-empowered
solutions, in which a shift toward human-centred approaches is advocated
to mitigate the limitations of existing techniques. In this work, we
propose a human-in-the-loop approach for fairness auditing, presenting a mixed
visual analytical system (hereafter referred to as 'FairCompass'), which
integrates both a subgroup discovery technique and a decision tree-based schema
for end users. Moreover, we innovatively integrate an Exploration, Guidance and
Informed Analysis loop, to facilitate the use of the Knowledge Generation Model
for Visual Analytics in FairCompass. We evaluate the effectiveness of
FairCompass for fairness auditing in a real-world scenario, and the findings
demonstrate the system's potential for real-world deployability. We anticipate
this work will address the current gaps in research for fairness and facilitate
the operationalisation of fairness in machine learning systems.

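This work proposes 'FairCompass', a human-in-the-loop approach to fairness auditing. Its mixed visual analytical system brings a subgroup discovery technique and a decision tree-based schema to end users, facilitating the use of the Knowledge Generation Model for Visual Analytics. An evaluation in a real-world scenario shows the system's potential for real-world deployment, and the authors hope it will fill current gaps in fairness research and facilitate the operationalisation of fairness in machine learning systems.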

FairCompass: Operationalising Fairness in Machine Learning

FairCompass: Operationalising Fairness in Machine Learning

We provide practical, efficient, and nonparametric methods for auditing the
fairness of deployed classification and regression models. Whereas previous
work relies on a fixed sample size, our methods are sequential and allow for
the continuous monitoring of incoming data, making them highly amenable to
tracking the fairness of real-world systems. We also allow the data to be
collected by a probabilistic policy as opposed to sampled uniformly from the
population. This enables auditing to be conducted on data gathered for another
purpose. Moreover, this policy may change over time and different policies may
be used on different subpopulations. Finally, our methods can handle
distribution shift resulting from either changes to the model or changes in the
underlying population. Our approach is based on recent progress in
anytime-valid inference and game-theoretic statistics, in particular the
"testing by betting" framework. These connections ensure that our methods are
interpretable, fast, and easy to implement. We demonstrate the efficacy of our
methods on several benchmark fairness datasets.
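
The "testing by betting" idea can be illustrated with a small sequential test. The sketch below is a simplified illustration, not the authors' procedure: it assumes outcomes arrive as pairs in [0, 1] (one per group), tests the null hypothesis that both groups have the same mean outcome, and uses a plain running-mean betting rule. The function name `betting_audit` and the parameters `alpha` and `max_bet` are hypothetical. Under the null, the wealth process is a nonnegative martingale starting at 1, so by Ville's inequality it exceeds 1/alpha with probability at most alpha, which permits checking after every observation.

```python
# Hypothetical sketch of a sequential fairness audit via betting.
# Assumption: each pair (a, b) holds one outcome in [0, 1] per group.

def betting_audit(pairs, alpha=0.05, max_bet=0.5):
    """Return (stopping_time, wealth); stopping_time is None if never rejected."""
    wealth = 1.0
    running_sum = 0.0  # sum of past payoffs a - b
    for t, (a, b) in enumerate(pairs, start=1):
        # The bet uses only past data (predictable), clipped so that
        # 1 + bet * payoff stays positive for payoffs in [-1, 1].
        mean_so_far = running_sum / (t - 1) if t > 1 else 0.0
        bet = max(-max_bet, min(max_bet, mean_so_far))
        payoff = a - b
        wealth *= 1.0 + bet * payoff
        running_sum += payoff
        if wealth >= 1.0 / alpha:
            return t, wealth  # enough evidence against equal means
    return None, wealth


# A stream where group A always receives 1 and group B always 0:
unfair_result = betting_audit([(1.0, 0.0)] * 20)  # stops at t = 9
# A stream with identical outcomes never moves the wealth:
fair_result = betting_audit([(1.0, 1.0), (0.0, 0.0)] * 50)
```

Because the stopping rule is valid at every step, the audit can run continuously on incoming data rather than waiting for a fixed sample size. A more adaptive betting rule would typically reject faster than the running mean used here.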

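This paper proposes a fairness auditing method that is nonparametric, supports continuous monitoring, allows data collected under probabilistic policies, and adapts to distribution shift, and validates its effectiveness on several benchmark fairness datasets.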

Auditing Fairness by Betting

Auditing Fairness by Betting

Before deploying a black-box model in high-stakes problems, it is important
to evaluate the model's performance on sensitive subpopulations. For example,
in a recidivism prediction task, we may wish to identify demographic groups for
which our prediction model has unacceptably high false positive rates or
certify that no such groups exist. In this paper, we frame this task, often
referred to as "fairness auditing," in terms of multiple hypothesis testing. We
show how the bootstrap can be used to simultaneously bound performance
disparities over a collection of groups with statistical guarantees. Our
methods can be used to flag subpopulations affected by model underperformance,
and certify subpopulations for which the model performs adequately. Crucially,
our audit is model-agnostic and applicable to nearly any performance metric or
group fairness criterion. Our methods also accommodate extremely rich -- even
infinite -- collections of subpopulations. Further, we generalize beyond
subpopulations by showing how to assess performance over certain distribution
shifts. We test the proposed methods on benchmark datasets in predictive
inference and algorithmic fairness and find that our audits can provide
interpretable and trustworthy guarantees.
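
One way to picture simultaneous bootstrap bounds over groups is sketched below. It is a simplified illustration under stated assumptions, not the paper's exact procedure: each record is a hypothetical `(group, error)` pair with `error` in {0, 1}, and the critical value is the (1 - alpha) quantile of the largest per-group deviation across bootstrap resamples. The names `group_rates`, `simultaneous_bounds`, and `flag_groups` are invented for this example.

```python
import math
import random

# Hypothetical sketch: simultaneous bootstrap intervals for per-group error rates.

def group_rates(records, groups):
    """Error rate per group; None when a group is absent from the sample."""
    totals = {g: 0 for g in groups}
    errors = {g: 0 for g in groups}
    for g, e in records:
        totals[g] += 1
        errors[g] += e
    return {g: (errors[g] / totals[g] if totals[g] else None) for g in groups}


def simultaneous_bounds(records, alpha=0.1, n_boot=500, seed=0):
    """Return {group: (lower, upper)} intended to hold for all groups jointly."""
    rng = random.Random(seed)
    groups = sorted({g for g, _ in records})
    observed = group_rates(records, groups)
    n = len(records)
    max_devs = []
    for _ in range(n_boot):
        sample = [records[rng.randrange(n)] for _ in range(n)]
        boot = group_rates(sample, groups)
        # Largest deviation over groups in this resample.
        devs = [abs(boot[g] - observed[g]) for g in groups if boot[g] is not None]
        max_devs.append(max(devs))
    max_devs.sort()
    idx = min(n_boot - 1, max(0, math.ceil((1 - alpha) * n_boot) - 1))
    crit = max_devs[idx]
    return {
        g: (max(0.0, observed[g] - crit), min(1.0, observed[g] + crit))
        for g in groups
    }


def flag_groups(bounds, threshold):
    """Groups whose lower bound already exceeds the tolerated error rate."""
    return [g for g, (low, _high) in bounds.items() if low > threshold]
```

Groups whose upper bound falls below the tolerated rate could be certified in the same way. Using a single critical value for all groups is what makes the intervals simultaneous, at the cost of wider intervals for large groups.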

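Through multiple hypothesis testing, the bootstrap is used to simultaneously bound performance disparities over a collection of subpopulations with statistical guarantees, thereby flagging subpopulations affected by model underperformance and certifying those for which the model performs adequately. The method also accommodates extremely rich, even infinite, collections of subpopulations and supports assessing performance under certain distribution shifts.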