Many re-ranking strategies in search systems rely on stochastic ranking
policies, encoded as Doubly-Stochastic (DS) matrices, that satisfy desired
ranking constraints in expectation, e.g., Fairness of Exposure (FOE). These
strategies are generally two-stage pipelines: \emph{i)} an offline step that
constructs the re-ranking policy and \emph{ii)} an online step that samples rankings.
Building a re-ranking policy requires solving a constrained optimization
problem for each issued query, so the optimization must be rerun for every
new or unseen query. For sampling, the Birkhoff-von-Neumann decomposition
(BvND) is the favored approach to draw rankings from any DS-based policy.
However, the BvND is too costly to compute online, so it must be precomputed
and stored. As a sampling solution it is therefore memory-consuming: its
storage can grow as $\mathcal{O}(N\, n^2)$ for $N$ queries and $n$ documents.
This paper offers a novel, fast, lightweight way to predict fair stochastic
re-ranking policies: Constrained Meta-Optimal Transport (CoMOT). This method
fits a neural network shared across queries, as a learning-to-rank system does. We
also introduce Gumbel-Matching Sampling (GumMS), an online approach for sampling
rankings from DS-based policies. Our proposed pipeline, CoMOT + GumMS, only needs to
store the parameters of a single model, and it generalizes to unseen queries.
We empirically evaluate our pipeline on the TREC 2019 and 2020 datasets under
FOE constraints. Our experiments show that CoMOT rapidly predicts fair
re-ranking policies on held-out data, with a speed-up proportional to the
average number of documents per query. It also displays fairness and ranking
performance similar to the original optimization-based policy. Furthermore, we
empirically validate the effectiveness of GumMS in approximating DS-based
policies in expectation.
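
As an illustration of the two ideas above, the following Python sketch computes the expected exposure implied by a small DS matrix and draws one ranking from it by Gumbel perturbation followed by a maximum-weight matching. It is a minimal sketch under assumptions, not the paper's implementation: the function names, the position weights, and the brute-force matching (feasible only for a handful of documents) are illustrative, and the exact perturbation used by GumMS may differ.

```python
import itertools
import math
import random


def expected_exposure(policy, position_weights):
    """Expected exposure per document under a doubly-stochastic policy.

    policy[i][k] is the probability that document i is placed at rank k.
    position_weights[k] is an assumed attention weight for rank k.
    """
    return [sum(p * w for p, w in zip(row, position_weights)) for row in policy]


def gumbel_noise(rng):
    """Standard Gumbel(0, 1) sample via the inverse-CDF transform."""
    u = max(rng.random(), 1e-300)  # guard against log(0)
    return -math.log(-math.log(u))


def gumbel_matching_sample(policy, rng, eps=1e-12):
    """Draw one ranking: perturb log-probabilities, then solve the matching.

    Brute force over all n! assignments, so only usable for tiny n; a real
    system would call a linear assignment solver instead (assumption).
    """
    n = len(policy)
    scores = [[math.log(policy[i][k] + eps) + gumbel_noise(rng) for k in range(n)]
              for i in range(n)]
    # rank_of[i] is the rank assigned to document i by the best matching.
    rank_of = max(itertools.permutations(range(n)),
                  key=lambda perm: sum(scores[i][perm[i]] for i in range(n)))
    ranking = [None] * n
    for doc, rank in enumerate(rank_of):
        ranking[rank] = doc
    return ranking  # ranking[k] is the document shown at rank k


if __name__ == "__main__":
    # Hypothetical 3-document policy; rows and columns each sum to 1.
    policy = [[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]]
    weights = [1.0 / math.log2(k + 2) for k in range(3)]
    print(expected_exposure(policy, weights))
    print(gumbel_matching_sample(policy, random.Random(0)))
```

Only the matrix itself has to be available at sampling time; no decomposition into permutation matrices is stored, which is the memory argument made above.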

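This paper proposes a novel, fast, and lightweight method for predicting fair stochastic re-ranking policies, Constrained Meta-Optimal Transport (CoMOT), together with an online sampling method, Gumbel-Matching Sampling (GumMS). CoMOT fits a single neural network shared across queries. Experiments on the TREC 2019 and 2020 datasets show that the method quickly predicts fair re-ranking policies on held-out data without rerunning the optimization procedure for each new query, while retaining comparable fairness and ranking performance.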

Learning to Re-rank with Constrained Meta-Optimal Transport

Conventional Learning-to-Rank (LTR) methods optimize the utility of the
rankings to the users, but they are oblivious to their impact on the ranked
items. However, there has been a growing understanding that the latter is
important to consider for a wide range of ranking applications (e.g., online
marketplaces, job placement, admissions). To address this need, we propose a
general LTR framework that can optimize a wide range of utility metrics (e.g.,
NDCG) while satisfying fairness of exposure constraints with respect to the
items. This framework expands the class of learnable ranking functions to
stochastic ranking policies, which provides a language for rigorously
expressing fairness specifications. Furthermore, we provide a new LTR algorithm
called Fair-PG-Rank for directly searching the space of fair ranking policies
via a policy-gradient approach. Beyond the theoretical evidence given in deriving the
framework and the algorithm, we provide empirical results on simulated and
real-world datasets that verify the effectiveness of the approach in individual
and group-fairness settings.
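
The policy-gradient idea can be sketched in a few lines of Python. The sketch assumes a Plackett-Luce policy over per-item scores and DCG as the utility; the abstract names neither. The function names are hypothetical, and the reward is passed in as a callable so that a fairness penalty could be subtracted from the utility. This illustrates the general technique (a REINFORCE-style estimator) rather than the Fair-PG-Rank algorithm itself.

```python
import math
import random


def sample_plackett_luce(scores, rng):
    """Sample a ranking by repeatedly drawing the next item with softmax odds."""
    remaining = list(range(len(scores)))
    ranking = []
    while remaining:
        weights = [math.exp(scores[i]) for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        ranking.append(pick)
        remaining.remove(pick)
    return ranking


def grad_log_prob(scores, ranking):
    """Gradient of log P(ranking) with respect to the scores."""
    grad = [0.0] * len(scores)
    remaining = list(ranking)
    for doc in ranking:
        z = sum(math.exp(scores[i]) for i in remaining)
        for i in remaining:
            grad[i] -= math.exp(scores[i]) / z  # softmax over remaining items
        grad[doc] += 1.0  # the item actually chosen at this position
        remaining.remove(doc)
    return grad


def dcg(ranking, relevance):
    """Discounted cumulative gain of a ranking (assumed utility metric)."""
    return sum(relevance[d] / math.log2(k + 2) for k, d in enumerate(ranking))


def policy_gradient_estimate(scores, reward, rng, num_samples=1000):
    """Monte Carlo (REINFORCE) estimate of the gradient of E[reward(ranking)]."""
    grad = [0.0] * len(scores)
    for _ in range(num_samples):
        ranking = sample_plackett_luce(scores, rng)
        r = reward(ranking)
        for i, g in enumerate(grad_log_prob(scores, ranking)):
            grad[i] += r * g / num_samples
    return grad


if __name__ == "__main__":
    relevance = [1.0, 0.0, 0.0]  # hypothetical relevance labels
    grad = policy_gradient_estimate([0.0, 0.0, 0.0],
                                    lambda rk: dcg(rk, relevance),
                                    random.Random(0))
    print(grad)  # the first component should point toward raising item 0
```

In a fairness-aware variant, `reward` would return the utility minus a weighted exposure-disparity term; the weight and the disparity measure are left unspecified here.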

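This paper proposes a general LTR framework that uses stochastic ranking policies to learn fair rankings while accounting for the impact on the ranked items. The policy is optimized with Fair-PG-Rank, a policy-gradient algorithm, and can optimize a range of utility metrics while satisfying fairness of exposure. Experimental results verify the effectiveness of the approach in individual and group fairness settings.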