BriefGPT.xyz
Oct, 2020
Token-level Adaptive Training for Neural Machine Translation
Shuhao Gu, Jinchao Zhang, Fandong Meng, Yang Feng, Wanying Xie...
TL;DR
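This paper studies the token imbalance phenomenon in neural machine translation and proposes training the model with token-level adaptive objectives based on target token frequency, aiming to improve both translation quality and lexical diversity. Compared with the baseline, the method yields BLEU gains of 1.68, 1.02, and 0.52, respectively, on sentences that contain more low-frequency tokens.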
Abstract
There exists a token imbalance phenomenon in natural language as different tokens appear with different frequencies, which leads to different learning difficulties for tokens in neural machine translation (NMT).