November 2019
Understanding Knowledge Distillation in Non-autoregressive Machine Translation
Chunting Zhou, Graham Neubig, Jiatao Gu
TL;DR
Through experiments, this paper finds that knowledge distillation reduces the complexity of the dataset and helps non-autoregressive machine translation (NAT) models better model the variation in the output, improving translation quality. It further proposes several methods for adjusting dataset complexity to improve NAT model performance, achieving state-of-the-art results.
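For concreteness, the distillation studied here is sequence-level: each reference translation in the training set is replaced by the output of a trained autoregressive teacher, and the NAT student is trained on these simpler, more deterministic targets. A minimal sketch of that data-creation step, assuming a hypothetical `teacher_translate` helper (not code from the paper):

```python
# Sketch of sequence-level knowledge distillation for NAT training data.
# Assumption: `teacher_translate` wraps a trained autoregressive MT model
# (e.g., a Transformer decoded with beam search); it is a hypothetical
# helper, not an API from the paper.

from typing import Callable, List, Tuple

def build_distilled_corpus(
    src_sentences: List[str],
    teacher_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Replace each reference target with the teacher's decoded output.

    The resulting (source, teacher_output) pairs form the lower-complexity
    dataset on which the NAT student is trained.
    """
    return [(src, teacher_translate(src)) for src in src_sentences]

# Usage: train the NAT student on `distilled` instead of the raw bitext.
# distilled = build_distilled_corpus(train_src, teacher_translate)
```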
Abstract
Non-autoregressive machine translation (NAT) systems predict a sequence of output tokens in parallel, achieving substantial improvements in generation speed compared to autoregressive models. Existing NAT models …
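To illustrate the speed contrast the abstract refers to: an autoregressive decoder needs one forward pass per output token, while a NAT decoder scores every position in a single pass. A schematic comparison follows; the `model` call signatures are assumptions for illustration, not an actual NAT API:

```python
import torch

def autoregressive_decode(model, src, max_len, bos_id, eos_id):
    """O(max_len) sequential passes: each token conditions on the prefix."""
    ys = torch.tensor([[bos_id]])
    for _ in range(max_len):
        logits = model(src, ys)  # assumed signature: (src, prefix) -> [1, len, V]
        next_tok = logits[:, -1].argmax(-1, keepdim=True)
        ys = torch.cat([ys, next_tok], dim=1)
        if next_tok.item() == eos_id:
            break
    return ys

def non_autoregressive_decode(model, src, tgt_len):
    """One forward pass predicts all positions independently, in parallel."""
    logits = model(src, tgt_len)  # assumed signature: (src, length) -> [1, tgt_len, V]
    return logits.argmax(-1)
```

The independence across positions is what makes NAT fast, and also why the multimodal "variations in the output" of raw parallel data are hard for it to fit, which is the gap knowledge distillation narrows.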