Neural networks are vulnerable to adversarial examples, i.e., inputs that are imperceptibly perturbed from natural data and yet incorrectly classified by the network. Adversarial training, a heuristic form of robust optimization that alternates between minimization and maximization steps, has proven to be among the most successful methods for training networks that are robust against a pre-defined family of perturbations. This paper provides a partial explanation for the success of adversarial training. When the inner maximization problem can be solved to optimality, we prove that adversarial training finds a network of small robust train loss. When the maximization problem is solved by a heuristic algorithm, we prove that adversarial training finds a network of small robust surrogate train loss. The analysis technique leverages recent work on the analysis of neural networks via the Neural Tangent Kernel (NTK), combined with tools from online learning when the maximization is solved by a heuristic, and with the expressiveness of the NTK in the $\ell_\infty$-norm.
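The alternation between maximization and minimization steps can be illustrated with a minimal sketch. It is not the setting analyzed in the paper: it assumes a linear classifier with logistic loss and an $\ell_\infty$ perturbation budget `eps`, a case in which the inner maximization has a closed-form solution and can therefore be solved to optimality. All function names and the toy data are hypothetical and chosen only for illustration.

```python
import math

def sign(v):
    # Returns -1, 0, or +1.
    return (v > 0) - (v < 0)

def worst_case_input(w, x, y, eps):
    # Assumption: linear score w.x and label y in {-1, +1}. The l_inf-bounded
    # perturbation that minimizes the margin y * w.x is -eps * y * sign(w).
    return [xi - eps * y * sign(wi) for xi, wi in zip(x, w)]

def logistic_loss(w, x, y):
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    # log(1 + exp(-margin)), with a linear fallback for very negative margins.
    return math.log1p(math.exp(-margin)) if margin > -30 else -margin

def robust_loss(w, data, eps):
    # Average loss evaluated at the worst-case perturbation of each input.
    total = sum(logistic_loss(w, worst_case_input(w, x, y, eps), y)
                for x, y in data)
    return total / len(data)

def adversarial_train(data, eps, lr=0.1, steps=200):
    dim = len(data[0][0])
    w = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for x, y in data:
            # Maximization step: exact worst-case input for the current w.
            x_adv = worst_case_input(w, x, y, eps)
            margin = y * sum(wi * xi for wi, xi in zip(w, x_adv))
            # Derivative of the logistic loss with respect to the score.
            coeff = -y / (1.0 + math.exp(margin))
            for j in range(dim):
                grad[j] += coeff * x_adv[j] / len(data)
        # Minimization step: gradient descent on the loss at the perturbed inputs.
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Toy, linearly separable data (hypothetical).
toy_data = [([2.0, 1.5], 1), ([1.5, 2.0], 1),
            ([-2.0, -1.5], -1), ([-1.5, -2.0], -1)]
w_robust = adversarial_train(toy_data, eps=0.5)
print(robust_loss(w_robust, toy_data, eps=0.5))
```

For a neural network the inner maximization generally has no closed form, so the call to `worst_case_input` would be replaced by a heuristic attack; that is the case in which the guarantee stated above is on the robust surrogate train loss.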

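This paper studies the robustness of neural networks, using adversarial training to improve robustness against adversarial perturbations. It shows that adversarial training lets the network converge to a robust classifier, and that the conventional cross-entropy loss is not suitable for training a robust classifier, which is why a surrogate loss has to be introduced. It also proves that robust interpolation requires greater model capacity.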

Convergence of Adversarial Training in Overparametrized Neural Networks