Nov, 2017
Synthetic and Natural Noise Both Break Neural Machine Translation
Yonatan Belinkov, Yonatan Bisk
TL;DR
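This paper studies character-based neural machine translation models and finds that they can handle out-of-vocabulary words and learn morphology, but make errors easily when faced with noisy data. The authors explore two approaches to improving robustness: structure-invariant word representations and robust training on noisy data. They find that a model based on a character convolutional neural network can learn representations that are robust to multiple kinds of noise at once.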
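To make the two ideas in the summary concrete, here is a minimal Python sketch of one kind of synthetic noise and one order-insensitive word representation. It is an illustration only, not the authors' implementation: the function names `swap_noise` and `bag_of_chars` are hypothetical, the adjacent-character swap is just one assumed noise type, and the character-count vector is a simple stand-in for a structure-invariant representation rather than the model described in the paper.

```python
import random
from collections import Counter


def swap_noise(word, rng):
    """Swap two adjacent interior characters; first and last stay in place.

    Hypothetical helper: one simple form of synthetic character noise.
    Words shorter than four characters are returned unchanged.
    """
    if len(word) < 4:
        return word
    # Pick i so that both i and i + 1 are interior positions.
    i = rng.randrange(1, len(word) - 2)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def bag_of_chars(word):
    """Order-insensitive representation: a count of each character.

    Assumed stand-in for a structure-invariant word representation.
    """
    return Counter(word)


if __name__ == "__main__":
    rng = random.Random(0)
    clean = "translation"
    noisy = swap_noise(clean, rng)
    print(clean, "->", noisy)
    # The character counts ignore order, so the swap does not change them.
    print(bag_of_chars(clean) == bag_of_chars(noisy))
```

Because the count vector discards character order, any scrambling of a word maps to the same representation. The trade-off in this sketch is that anagrams such as "form" and "from" become indistinguishable.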
Abstract
Character-based neural machine translation (NMT) models alleviate out-of-vocabulary issues, learn morphology, and move us closer to completely end-to-end translation systems. Unfortunately, they are also very bri…