BriefGPT.xyz
Nov, 2019
Domain, Translationese and Noise in Synthetic Data for Neural Machine Translation
Nikolay Bogoychev, Rico Sennrich
TL;DR
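The quality of neural machine translation can be improved by using additional monolingual resources to create synthetic training data. This paper examines the respective merits of forward translation (translating source-language sentences) and back-translation (translating target-language sentences), and studies how domain, translationese, and noise affect the results. It also compares forward translation and back-translation in low-resource settings.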
Abstract
The quality of neural machine translation can be improved by leveraging additional monolingual resources to create synthetic training data. Source-side monolingual data can be (forward-)translated into the target …