State-of-the-art deep neural networks are known to be vulnerable to adversarial examples, formed by applying small but malicious perturbations to the original inputs. Moreover, the perturbations can \textit{transfer across models}: adversarial examples generated for a specific model will often mislead other unseen models. Consequently the adversary can leverage it to attack deployed systems without any query, which severely hinder the application of deep learning, especially in the areas where security is crucial. In this work, we systematically study how two classes of factors that might influence the transferability of adversarial examples. One is about model-specific factors, including network architecture, model capacity and test accuracy. The other is the local smoothness of loss function for constructing adversarial examples. Based on these understanding, a simple but effective strategy is proposed to enhance transferability. We call it variance-reduced attack, since it utilizes the variance-reduced gradient to generate adversarial example. The effectiveness is confirmed by a variety of experiments on both CIFAR-10 and ImageNet datasets.

本文系统研究了影响对抗样本传递性的两类因素，包括网络结构、测试精度等模型特定因素和构建对抗样本的损失函数的局部光滑性。基于这些理解，提出了一种简单而有效的策略来增强传递性，称为方差降低攻击，因为它利用方差降低梯度来生成对抗样本，实验结果表明其有效性。

理解和提升对抗样本的可迁移性