Modern machine learning and deep learning models have been shown to be vulnerable when test data are slightly perturbed. Theoretical studies of adversarial training algorithms mostly focus on their adversarial training losses or local convergence properties. In contrast, this paper studies the generalization performance of a generic adversarial training algorithm. Specifically, we consider linear regression models and two-layer neural networks (with lazy training) using squared loss under both low-dimensional and high-dimensional regimes. In the former regime, the adversarial risk of the trained models converges to the minimal adversarial risk. In the latter regime, we discover that data interpolation prevents the adversarially robust estimator from being consistent (i.e., from converging in probability). Therefore, inspired by the success of the least absolute shrinkage and selection operator (LASSO), we incorporate the L1 penalty into high-dimensional adversarial learning and show, both in theory and in numerical trials, that it leads to consistent adversarially robust estimation.
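To make the objects in the abstract concrete, the following is a minimal sketch of the adversarial squared loss for a linear model and of an L1-penalized adversarial objective. It assumes an L2-bounded attack of radius `eps`, which the abstract does not specify, and all function names are hypothetical illustrations rather than the paper's implementation. Under that assumption the inner maximization has the closed form (|y - x·θ| + eps·‖θ‖₂)².

```python
import math


def adversarial_squared_loss(theta, x, y, eps):
    """Worst-case squared loss of a linear model under an L2 attack.

    Assumption (not stated in the abstract): perturbations d satisfy
    ||d||_2 <= eps, in which case
    max_d (y - (x + d) . theta)^2 = (|y - x . theta| + eps * ||theta||_2)^2.
    """
    residual = y - sum(t * xi for t, xi in zip(theta, x))
    theta_norm = math.sqrt(sum(t * t for t in theta))
    return (abs(residual) + eps * theta_norm) ** 2


def worst_case_perturbation(theta, x, y, eps):
    """Perturbation attaining the maximum: it enlarges the residual's magnitude."""
    residual = y - sum(t * xi for t, xi in zip(theta, x))
    theta_norm = math.sqrt(sum(t * t for t in theta))
    if theta_norm == 0.0:
        # With theta = 0 the prediction does not depend on x at all.
        return [0.0] * len(theta)
    sign = 1.0 if residual >= 0 else -1.0
    return [-sign * eps * t / theta_norm for t in theta]


def l1_penalized_objective(theta, data, eps, lam):
    """Empirical adversarial risk plus the L1 penalty lam * ||theta||_1."""
    risk = sum(adversarial_squared_loss(theta, x, y, eps) for x, y in data)
    return risk / len(data) + lam * sum(abs(t) for t in theta)


# Usage: the closed form matches the loss at the explicit worst-case input.
theta, x, y, eps = [3.0, 4.0], [1.0, 0.0], 5.0, 0.1
d = worst_case_perturbation(theta, x, y, eps)
attacked = [xi + di for xi, di in zip(x, d)]
direct = (y - sum(t * a for t, a in zip(theta, attacked))) ** 2
closed = adversarial_squared_loss(theta, x, y, eps)
objective = l1_penalized_objective(theta, [(x, y)], eps, lam=0.1)
```

Minimizing `l1_penalized_objective` over `theta` would give one possible L1-regularized adversarial estimator; the penalty level `lam` and the choice of optimizer are left open here.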

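This paper studies the generalization performance of a generic adversarial training algorithm, considering linear regression models and two-layer neural networks (with squared loss) in both low-dimensional and high-dimensional settings. We find that data interpolation prevents the adversarially robust estimator from being consistent. We therefore introduce an L1 penalty into high-dimensional adversarial learning and prove that it leads to consistent adversarially robust estimation.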

On the Generalization Properties of Adversarial Training