Owing to the susceptibility of deep learning systems to adversarial attacks, there has been a great deal of work on developing robust classifiers (both empirically and certifiably robust), but the vast majority of it defends against only a single type of attack. Recent work has looked at defending against multiple attacks, specifically on the MNIST dataset, yet that approach used a relatively complex architecture and claimed that standard adversarial training cannot be applied because it "overfits" to a particular norm. In this work, we show that it is indeed possible to adversarially train a model that is robust against a union of norm-bounded attacks, by using a natural generalization of the standard PGD-based adversarial training procedure to multiple threat models. With this approach, we are able to train standard architectures that are robust against $\ell_\infty$, $\ell_2$, and $\ell_1$ attacks, outperforming past approaches on the MNIST dataset and providing the first CIFAR10 network trained to be simultaneously robust against $(\ell_{\infty}, \ell_{2},\ell_{1})$ threat models. This network achieves adversarial accuracy rates of $(47.6\%, 64.8\%, 53.4\%)$ for $(\ell_{\infty}, \ell_{2},\ell_{1})$ perturbations with radii $\epsilon = (0.03,0.5,12)$.
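To make the idea of generalizing PGD to several threat models concrete, the following is a minimal sketch of one possible reading, not necessarily the exact procedure used in this work. It assumes that, at every iteration, a steepest-ascent step is taken and projected separately for each of the $\ell_\infty$, $\ell_2$, and $\ell_1$ balls, and that the candidate with the highest loss is kept. The toy linear classifier with logistic loss, the function names (`multi_norm_pgd`, `project_l1`, `logistic_loss`, `grad_x`), and the radii and step sizes are all hypothetical choices made for illustration; clipping to the valid pixel range is omitted.

```python
import numpy as np

def logistic_loss(w, b, x, y):
    # Logistic loss of a linear classifier; y is +1 or -1.
    return np.logaddexp(0.0, -y * (w @ x + b))

def grad_x(w, b, x, y):
    # Gradient of the logistic loss with respect to the input x.
    margin = y * (w @ x + b)
    return -y * w / (1.0 + np.exp(margin))

def project_l1(v, eps):
    # Euclidean projection onto the l1 ball of radius eps (sort-based).
    if np.abs(v).sum() <= eps:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(u) + 1)
    rho = np.nonzero(u * k > css - eps)[0][-1]
    theta = (css[rho] - eps) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def multi_norm_pgd(w, b, x, y, eps, alpha, steps=50):
    # Assumed scheme: per-norm steepest-ascent step, per-norm projection,
    # then keep whichever candidate perturbation gives the highest loss.
    delta = np.zeros_like(x)
    for _ in range(steps):
        g = grad_x(w, b, x + delta, y)

        # l_inf candidate: signed-gradient step, clipped to the box.
        d_inf = np.clip(delta + alpha["linf"] * np.sign(g),
                        -eps["linf"], eps["linf"])

        # l_2 candidate: normalized-gradient step, rescaled into the ball.
        d_2 = delta + alpha["l2"] * g / (np.linalg.norm(g) + 1e-12)
        d_2 = d_2 * min(1.0, eps["l2"] / (np.linalg.norm(d_2) + 1e-12))

        # l_1 candidate: step along the largest-gradient coordinate only.
        i = np.argmax(np.abs(g))
        step = np.zeros_like(x)
        step[i] = alpha["l1"] * np.sign(g[i])
        d_1 = project_l1(delta + step, eps["l1"])

        delta = max((d_inf, d_2, d_1),
                    key=lambda d: logistic_loss(w, b, x + d, y))
    return delta

# Toy usage with hypothetical values (not the radii reported above).
w = np.array([1.0, -2.0, 0.5, 0.0])
b = 0.1
x = np.array([0.5, 0.2, 0.1, 0.9])
y = 1.0
eps = {"linf": 0.1, "l2": 0.3, "l1": 0.5}
alpha = {"linf": 0.02, "l2": 0.06, "l1": 0.1}
delta = multi_norm_pgd(w, b, x, y, eps, alpha, steps=20)
print("clean loss:", logistic_loss(w, b, x, y))
print("adversarial loss:", logistic_loss(w, b, x + delta, y))
```

Under this assumed scheme, the returned perturbation always lies inside at least one of the three balls, because the last candidate kept was projected onto its own ball. In adversarial training, the model parameters would then be updated on the loss evaluated at `x + delta` rather than at `x`.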

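This work proposes a PGD-based method that combines multiple perturbation models to improve the robustness of deep learning systems, and evaluates it on the MNIST and CIFAR10 datasets.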

Adversarial Robustness Against the Union of Multiple Perturbation Models