BriefGPT.xyz
Feb, 2021
Exploring Supervised and Unsupervised Rewards in Machine Translation
Julia Ive, Zixu Wang, Marina Fomicheva, Lucia Specia
TL;DR
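The paper proposes two methods for making machine translation systems less dependent on the metric function used during training: an entropy-regularized RL method, and a new RL method that explores dynamic unsupervised reward functions. These methods improve translation quality and generalization while reducing the BLEU reward function's dependence on the words used in the reference text.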
Abstract
Reinforcement learning (RL) is a powerful framework to address the discrepancy between loss functions used during training and the final evaluation metrics to be used at test time. When applied to neural machine transla…