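TL;DR: By equalizing the target probability distributions of the synthetic data generator across sensitive attributes, downstream models trained on AI-generated synthetic data can deliver fair predictions, and these predictions remain robustly fair even when the synthetic data is inferred from biased original data.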
Abstract
AI-generated synthetic data, in addition to protecting the privacy of
original data sets, allows users and data consumers to tailor data to their
needs. This paper explores the creation of synthetic data that embodies
f