A necessary characteristic for deploying deep learning models in real-world applications is resistance to small adversarial perturbations while maintaining accuracy on non-malicious inputs. Although robust training yields models with better adversarial accuracy than standard models, a significant gap in natural accuracy remains between robust and non-robust models, and we aim to bridge it. We consider a number of ensemble methods designed to mitigate this performance difference. Our key insight is that models trained to withstand small attacks can, when ensembled, often withstand significantly larger attacks, and that this observation can in turn be leveraged to optimize natural accuracy. We consider two schemes: one combines predictions from several randomly initialized robust models, and the other fuses features from robust and standard models.
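The first scheme, combining predictions from several models, can be illustrated with a minimal sketch. The abstract does not state the combination rule, so averaging softmax probabilities is an assumption here, and the function names `softmax` and `ensemble_predict` and the toy logits are hypothetical, not taken from the paper.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def ensemble_predict(per_model_logits):
    """Average class probabilities across models (assumed combination rule).

    per_model_logits: array-like of shape (n_models, n_samples, n_classes).
    Returns (predicted_labels, mean_probabilities).
    """
    probs = softmax(np.asarray(per_model_logits, dtype=float))
    mean_probs = probs.mean(axis=0)  # average over the model axis
    return mean_probs.argmax(axis=-1), mean_probs


# Toy logits standing in for three independently initialized robust models,
# two input samples, and three classes (illustrative values only).
toy_logits = np.array([
    [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 1.0, 0.0], [0.0, 2.0, 0.0]],
    [[3.0, 0.0, 0.0], [0.0, 0.0, 0.5]],
])
labels, mean_probs = ensemble_predict(toy_logits)
```

In this toy case the second model disagrees with the other two on the first sample and the third model disagrees on the second sample, yet the averaged probabilities still follow the majority. Other combination rules, such as majority voting or averaging logits, would fit the same interface.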

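To improve both the resistance of deep learning models to small adversarial perturbations and their accuracy on non-malicious inputs in real-world applications, we consider several ensemble methods. The key insight is that models trained to withstand small attacks can, when ensembled, withstand larger attacks, and that this idea can be used to optimize natural accuracy. We consider two schemes: one combines predictions from several randomly initialized robust models, and the other fuses features from robust and standard models.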

Revisiting Ensembles in an Adversarial Context: Improving Natural Accuracy