TL;DR通过研究核心属性 Z 的规模,我们开发了一种可伸缩且有原则的两阶段估计过程,可以评估最先进模型的稳健性,证明了我们的方法认证模型的稳健性,防止部署不可靠的模型。
Abstract
The performance of ml models degrades when the training population is
different from that seen under operation. Towards assessing distributional
robustness, we study the worst-case performance of a model over all