Using Monolingual Data in Neural Machine Translation: a Systematic Study

Mar, 2019

Using Monolingual Data in Neural Machine Translation: a Systematic Study

Franck Burlot, François Yvon

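TL;DR: This paper presents a systematic study of data generation for neural machine translation. It compares different ways of using monolingual data and several data generation procedures, and introduces new data simulation techniques that are cheap and easy to implement. It finds that generating artificial parallel data through back-translation is very effective, and offers an explanation for why.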
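As a rough illustration of the back-translation idea mentioned in the summary, the sketch below pairs real target-side monolingual sentences with synthetic source sentences produced by a target-to-source translator, then adds them to the natural parallel data. It is a minimal sketch, not the authors' pipeline: the names `back_translate`, `build_training_set`, and `toy_french_to_english` are hypothetical, and the word-lookup "translator" merely stands in for a trained reverse MT system.

```python
from typing import Callable, List, Tuple

SentencePair = Tuple[str, str]  # (source sentence, target sentence)


def back_translate(
    target_monolingual: List[str],
    target_to_source: Callable[[str], str],
) -> List[SentencePair]:
    """Pair each real target sentence with a synthetic source sentence."""
    return [(target_to_source(t), t) for t in target_monolingual]


def build_training_set(
    natural_pairs: List[SentencePair],
    target_monolingual: List[str],
    target_to_source: Callable[[str], str],
) -> List[SentencePair]:
    """Concatenate natural parallel data with back-translated pairs."""
    return list(natural_pairs) + back_translate(target_monolingual, target_to_source)


def toy_french_to_english(sentence: str) -> str:
    """Hypothetical stand-in for a reverse model: word-by-word lookup."""
    lexicon = {"le": "the", "chat": "cat", "dort": "sleeps"}
    return " ".join(lexicon.get(word, word) for word in sentence.split())


if __name__ == "__main__":
    natural = [("a dog", "un chien")]
    monolingual_french = ["le chat dort"]
    for pair in build_training_set(natural, monolingual_french, toy_french_to_english):
        print(pair)
```

In this sketch the target side of every synthetic pair is genuine text; only the source side is machine-generated.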

Abstract

Neural machine translation (MT) has radically changed the way systems are developed. A major difference with the previous generation (Phrase-Based MT) is the way monolingual target data, which often abounds, is used in these two paradigms. While Phrase-Based MT can seamlessly integrate