Subword Regularization for Improving Neural Network Translation Models Using Multiple Subword Candidates

Apr, 2018

Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates

Taku Kudo

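TL;DR: Introduces a regularization method that uses the noise in subword segmentation to improve the robustness of neural machine translation, mainly for low-resource settings.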
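To make the idea of segmentation noise concrete, here is a minimal sketch that shows how one word can be split in several ways under a single vocabulary, and how one split can be drawn at random. The vocabulary `TOY_VOCAB` and the helpers `segmentations` and `sample_segmentation` are hypothetical names introduced only for this illustration. Sampling is uniform purely for simplicity and should not be read as the sampling scheme used in the paper.

```python
import random

# Hypothetical toy vocabulary, chosen only for illustration.
TOY_VOCAB = {"h", "e", "l", "o", "he", "ll", "lo", "hell", "hello"}


def segmentations(text, vocab):
    """Return every way to split `text` into pieces that are all in `vocab`."""
    if not text:
        return [[]]  # one way to segment the empty string: no pieces
    results = []
    for end in range(1, len(text) + 1):
        piece = text[:end]
        if piece in vocab:
            # Extend each segmentation of the remainder with this prefix.
            for rest in segmentations(text[end:], vocab):
                results.append([piece] + rest)
    return results


def sample_segmentation(text, vocab, rng):
    """Pick one candidate uniformly at random (an assumption of this sketch)."""
    return rng.choice(segmentations(text, vocab))


candidates = segmentations("hello", TOY_VOCAB)
for pieces in candidates:
    print(" ".join(pieces))

rng = random.Random(0)  # seeded so repeated runs draw the same sequence
print(sample_segmentation("hello", TOY_VOCAB, rng))
```

In a training loop, one could call `sample_segmentation` anew each time a sentence is visited, so that the model is exposed to different segmentations of the same text rather than a single fixed one.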

Abstract

Subword units are an effective way to alleviate the open vocabulary problems in neural machine translation (NMT). While sentences are usually converted into unique subword sequences, →