Adversarial training (AT) is one of the most effective strategies for promoting model robustness. However, recent benchmarks show that most of the proposed improvements on AT are less effective than simply early stopping the training procedure. This counter-intuitive fact motivates us to investigate the implementation details of tens of AT methods. Surprisingly, we find that the basic training settings (e.g., weight decay, learning rate schedule, etc.) used in these methods are highly inconsistent, which could largely affect the model performance as shown in our experiments. For example, a slightly different value of weight decay can reduce the model robust accuracy by more than 7%, which is probable to override the potential promotion induced by the proposed methods. In this work, we provide comprehensive evaluations on the effects of basic training tricks and hyperparameter settings for adversarially trained models. We provide a reasonable baseline setting and re-implement previous defenses to achieve new state-of-the-art results.

本文研究了对抗训练中常常忽视的训练技巧和超参数的影响，发现对于鲁棒性的影响比以往认为的更加敏感，如微调权重衰减值可能会使模型鲁棒性下降超过7%，而与此同时，我们基于这些发现重新设计了一个基线模型并获得了目前最优的结果。

对抗训练的技巧