BriefGPT.xyz
May 2023
Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation
Songming Zhang, Yunlong Liang, Shuaibo Wang, Wenjuan Han, Jian Liu, et al.
TL;DR
This paper studies knowledge distillation (KD) in neural machine translation and finds that the transferred knowledge mainly comes from the teacher's top-1 predictions. Building on this finding, it proposes a method called TIE-KD to strengthen knowledge distillation, incorporating measures such as a hierarchical ranking loss and iterative distillation. Experiments show that TIE-KD outperforms baseline models, with higher potential and better generalization performance (see the sketch below).
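For context, here is a minimal PyTorch sketch, not the authors' released code, contrasting standard word-level KD, which matches the teacher's full token distribution, with a top-1-only variant that trains on just the teacher's argmax prediction, the signal the paper identifies as the main carrier of knowledge. The function names and the temperature parameter are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def word_level_kd_loss(student_logits, teacher_logits, temperature=1.0):
    """Standard word-level KD: KL divergence between the teacher's and
    the student's token distributions at every decoding position.
    Logits have shape (batch, seq_len, vocab_size)."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    # KL(teacher || student); t*t rescales gradients per the usual KD recipe.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * t * t

def top1_kd_loss(student_logits, teacher_logits):
    """Top-1 variant (an assumption for illustration): supervise the
    student only with the teacher's argmax token at each position."""
    top1_targets = teacher_logits.argmax(dim=-1)  # (batch, seq_len)
    return F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        top1_targets.view(-1),
    )

# Toy usage with random logits: batch of 2, sequence length 5, vocab 100.
student = torch.randn(2, 5, 100)
teacher = torch.randn(2, 5, 100)
print(word_level_kd_loss(student, teacher).item())
print(top1_kd_loss(student, teacher).item())
```

If the top-1 variant recovers most of the benefit of full-distribution KD, that is consistent with the paper's claim that the knowledge largely hides in the teacher's top-1 predictions; TIE-KD's hierarchical ranking loss and iterative distillation build on this observation.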
Abstract
Knowledge distillation (KD) is a promising technique for model compression in neural machine translation. However, where the knowledge hides…