We develop a method to generate prediction sets with a guaranteed coverage rate that is robust to corruptions in the training data, such as missing or noisy variables. Our approach builds on conformal prediction, a powerful framework for constructing prediction sets that are valid under the i.i.d. assumption. Importantly, naively applying conformal prediction does not provide reliable predictions in this setting, because the corruptions induce a distribution shift. To account for this shift, we assume access to privileged information (PI). The PI is formulated as additional features that explain the distribution shift; however, these features are available only during training and are absent at test time. We approach this problem by introducing a novel generalization of weighted conformal prediction and support our method with theoretical coverage guarantees. Experiments on both real and synthetic datasets indicate that our approach achieves a valid coverage rate and constructs more informative predictions than existing methods, which are not supported by theoretical guarantees.
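For readers unfamiliar with the baseline that the abstract generalizes, the following is a minimal sketch of the standard weighted split-conformal quantile, not the method proposed here. It assumes absolute-residual scores on a held-out calibration set and assumes that likelihood-ratio weights for the calibration points and the test point are already known. The function names `weighted_conformal_quantile` and `prediction_interval` and all of their arguments are hypothetical, chosen only for illustration.

```python
import numpy as np


def weighted_conformal_quantile(scores, weights, test_weight, alpha):
    """Weighted (1 - alpha) quantile of calibration scores.

    The test point contributes a point mass at +infinity, so the result
    is +infinity whenever the calibration mass alone cannot reach 1 - alpha.
    All names here are illustrative assumptions, not the paper's API.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Normalize over the calibration points plus the test point.
    total = weights.sum() + test_weight
    probs = weights / total

    # Accumulate normalized weight in order of increasing score.
    order = np.argsort(scores)
    sorted_scores = scores[order]
    cumulative = np.cumsum(probs[order])

    # Smallest score whose cumulative weight reaches 1 - alpha.
    idx = np.searchsorted(cumulative, 1.0 - alpha, side="left")
    if idx >= len(sorted_scores):
        return float("inf")
    return float(sorted_scores[idx])


def prediction_interval(point_prediction, quantile):
    """Symmetric interval for absolute-residual scores |y - prediction|."""
    return (point_prediction - quantile, point_prediction + quantile)
```

With equal weights this reduces to ordinary split conformal prediction. The setting described in the abstract is harder because the weights depend on privileged information that is unavailable at test time, which this sketch does not address.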

Robust Conformal Prediction Using Privileged Information