Recent work has shown deep neural networks (DNNs) to be highly susceptible to well-designed, small perturbations at the input layer, or so-called adversarial examples. Taking images as an example, such distortions are often imperceptible, but can result in 100% misclassification for a state-of-the-art DNN. We study the structure of adversarial examples and explore network topology, pre-processing and training strategies to improve the robustness of DNNs. We perform various experiments to assess the removability of adversarial examples by corrupting with additional noise and pre-processing with denoising autoencoders (DAEs). We find that DAEs can remove substantial amounts of the adversarial noise. However, when stacking the DAE with the original DNN, the resulting network can again be attacked by new adversarial examples with even smaller distortion. As a solution, we propose the Deep Contractive Network, a model with a new end-to-end training procedure that includes a smoothness penalty inspired by the contractive autoencoder (CAE). This increases the network's robustness to adversarial examples, without a significant performance penalty.
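
The abstract does not spell out the form of the smoothness penalty. As an illustrative sketch only, the following assumes a single fully connected sigmoid layer and uses the standard contractive autoencoder penalty, the squared Frobenius norm of the Jacobian of the layer output with respect to its input. The function names, the toy weights, and the penalty weight `lam` are hypothetical and are not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(W, b, x):
    """Compute h = sigmoid(W x + b) for one fully connected layer."""
    return [sigmoid(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
            for row, b_i in zip(W, b)]

def contractive_penalty(W, b, x):
    """Squared Frobenius norm of dh/dx for the sigmoid layer.

    Since dh_i/dx_j = h_i * (1 - h_i) * W_ij, the norm factorises into
    one term per hidden unit.
    """
    h = layer_forward(W, b, x)
    return sum((h_i * (1.0 - h_i)) ** 2 * sum(w * w for w in row)
               for h_i, row in zip(h, W))

# Hypothetical toy layer: 2 hidden units, 3 inputs (values are made up).
W = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.5]]
b = [0.0, 0.1]
x = [0.2, 0.4, -0.6]

lam = 0.1  # assumed penalty weight, not from the paper
penalty = contractive_penalty(W, b, x)
regulariser = lam * penalty  # would be added to the classification loss
```

In a training loop, a term like `regulariser` would be added to the usual classification loss, so that the optimiser trades accuracy against sensitivity of the layer output to small input changes.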

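This paper studies the robustness of deep neural networks, in particular against attacks based on adversarial examples. It explores network structure, topology, pre-processing, and training strategies to make deep neural networks more resistant to such perturbations, and introduces a smoothness penalty to improve their robustness.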

Towards Deep Neural Network Architectures Robust to Adversarial Examples