In this work, we show that information about the context of an input $X$ can
improve the predictions of deep learning models when applied in new domains or
production environments. We formalize the notion of context as a
permutation-invariant representation of a set of data points that originate
from the same environment/domain as the input itself. These representations are
jointly learned with a standard supervised learning objective, providing
incremental information about the unknown outcome. Furthermore, we offer a
theoretical analysis of the conditions under which our approach can, in
principle, yield benefits, and formulate two necessary criteria that can be
easily verified in practice. Additionally, we contribute insights into the kinds
of distribution shift for which our approach promises robustness. Our
empirical evaluation demonstrates the effectiveness of our approach for both
low-dimensional and high-dimensional data sets. Finally, we demonstrate that we
can reliably detect scenarios where a model is tasked with unwarranted
extrapolation in out-of-distribution (OOD) domains, identifying potential
failure cases. Consequently, we showcase a method to select between the most
predictive and the most robust model, circumventing the well-known trade-off
between predictive performance and robustness.
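
To make the notion of a permutation-invariant context representation concrete, the following is a minimal sketch and not the architecture used in this work. It assumes mean pooling over per-element embeddings as the set encoder, and every name in it (`embed`, `context_representation`, `predict`, the random weights) is hypothetical. It only illustrates that reordering the context set leaves the prediction unchanged.

```python
import math
import random

def embed(x, weights, bias):
    """Per-element feature map: a single tanh layer (an assumed, hypothetical choice)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def context_representation(context_set, weights, bias):
    """Mean-pool the per-element embeddings; the mean does not depend on element order."""
    embedded = [embed(x, weights, bias) for x in context_set]
    n = len(embedded)
    return [sum(column) / n for column in zip(*embedded)]

def predict(x, context_set, weights, bias, head):
    """Linear head applied to the input concatenated with its context representation."""
    features = list(x) + context_representation(context_set, weights, bias)
    return sum(h * f for h, f in zip(head, features))

rng = random.Random(0)
dim_in, dim_ctx = 3, 4

# Random, untrained parameters: this sketch checks invariance, not accuracy.
weights = [[rng.gauss(0, 1) for _ in range(dim_in)] for _ in range(dim_ctx)]
bias = [rng.gauss(0, 1) for _ in range(dim_ctx)]
head = [rng.gauss(0, 1) for _ in range(dim_in + dim_ctx)]

# A context set: data points assumed to come from the same environment as x.
context_set = [[rng.gauss(0, 1) for _ in range(dim_in)] for _ in range(8)]
x = context_set[0]

shuffled = context_set[:]
rng.shuffle(shuffled)

original = predict(x, context_set, weights, bias, head)
permuted = predict(x, shuffled, weights, bias, head)

# The two predictions agree up to floating-point rounding.
print(abs(original - permuted) < 1e-9)
```

In a trained model, the embedding and the head would be fitted jointly on the supervised objective, so the pooled vector is free to encode whatever property of the environment helps prediction.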

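Contextual information about an input $X$ can improve the predictions of deep learning models in new domains or production environments. We formalize context as a permutation-invariant representation of a set of data points, learned jointly with a standard supervised learning objective, which provides incremental information about the unknown outcome. We support the approach with a theoretical analysis and an empirical evaluation, and investigate its robustness. In addition, we present a method for selecting between the most predictive and the most robust model, thereby avoiding the trade-off between predictive performance and robustness.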

Towards Context-Aware Domain Generalization: Representing Environments with Permutation-Invariant Networks

We propose a new method for count-based exploration in high-dimensional state
spaces. Unlike previous work, which relies on density models, we show that
counts can be derived by averaging samples from the Rademacher distribution (or
coin flips). This insight is used to set up a simple supervised learning
objective which, when optimized, yields a state's visitation count. We show
that our method is significantly more effective at deducing ground-truth
visitation counts than previous work; when used as an exploration bonus for a
model-free reinforcement learning algorithm, it outperforms existing approaches
on most of 9 challenging exploration tasks, including the Atari game
Montezuma's Revenge.
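
The statistical fact behind this idea can be checked with a small simulation. The sketch below is an illustration under stated assumptions and not the method's implementation: it averages the coin-flip vectors explicitly, whereas the abstract describes a learned model that recovers the count through a supervised objective. The average of $n$ independent Rademacher samples has second moment $1/n$, so inverting the mean squared coordinate of the averaged vector gives an estimate of $n$. The function names and the choice of 2000 coordinates are hypothetical.

```python
import random

def rademacher_vector(rng, dim):
    """Draw `dim` independent fair coin flips with values in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(dim)]

def estimate_count(num_visits, dim, rng):
    """Average one fresh coin-flip vector per visit, then invert the second moment."""
    totals = [0] * dim
    for _ in range(num_visits):
        for i, flip in enumerate(rademacher_vector(rng, dim)):
            totals[i] += flip
    means = [t / num_visits for t in totals]
    # Each coordinate of the average has expected square 1 / num_visits.
    second_moment = sum(m * m for m in means) / dim
    return 1.0 / second_moment

rng = random.Random(0)
for true_count in (1, 5, 25):
    # Prints the true count next to the noisy estimate.
    print(true_count, round(estimate_count(true_count, dim=2000, rng=rng), 2))
```

If the supervised objective is a squared-error regression onto these random targets (an assumption, since the abstract does not name the loss), its minimizer for a given state is exactly this average, which is how a trained regressor could expose the count without storing it.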

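We propose a new count-based exploration method for high-dimensional state spaces. Counts are obtained by averaging samples from the Rademacher distribution (or coin flips), and optimizing a simple supervised learning objective yields a state's visitation count. The method outperforms existing approaches on most of 9 challenging exploration tasks.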