Neural Machine Translation (NMT) has achieved significant breakthrough in performance but is known to suffer vulnerability to input perturbations. As real input noise is difficult to predict during training, robustness is a big issue for system deployment. In this paper, we improve the robustness of NMT models by reducing the effect of noisy words through a Context-Enhanced Reconstruction (CER) approach. CER trains the model to resist noise in two steps: (1) perturbation step that breaks the naturalness of input sequence with made-up words; (2) reconstruction step that defends the noise propagation by generating better and more robust contextual representation. Experimental results on Chinese-English (ZH-EN) and French-English (FR-EN) translation tasks demonstrate robustness improvement on both news and social media text. Further fine-tuning experiments on social media text show our approach can converge at a higher position and provide a better adaptation.

本文提出了一种通过Context-Enhanced Reconstruction（CER）方法提高神经机器翻译（NMT）在噪音输入下的稳健性的方法，该方法包括通过引入人造干扰词破坏自然性来抵制噪声，并通过提供更好的上下文表示来防止噪声传播。在中英文翻译和法英文翻译任务上的实验证明本方法能够提高新闻和社交媒体文本的稳健性，并且在社交媒体文本上的进一步微调实验表明该方法可以收敛到更高的位置并提供更好的适应性。

解决神经机器翻译在输入扰动中的漏洞