Apr, 2023
An Empirical Study of Leveraging Knowledge Distillation for Compressing Multilingual Neural Machine Translation Models
Varun Gumma, Raj Dabre, Pratyush Kumar
TL;DR
This paper investigates compressing MNMT models through knowledge distillation, finds it to be a challenging task, and proposes several design considerations and optimizations.
Abstract
Knowledge distillation (KD) is a well-known method for compressing neural models. However, works focusing on distilling knowledge from large multilingual neural machine translation (MNMT) models into smaller ones …
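For context, the snippet below is a minimal sketch of standard word-level knowledge distillation for a seq2seq student, written in a PyTorch style. It is illustrative only and not the authors' recipe from this paper; the tensor names (student_logits, teacher_logits, labels) and hyperparameters (alpha, temperature T) are assumptions.

import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, pad_id=0, alpha=0.5, T=2.0):
    """Blend hard-label cross-entropy with a soft KL term against the teacher.

    student_logits, teacher_logits: (batch, seq_len, vocab)
    labels: (batch, seq_len) gold target token ids
    """
    vocab = student_logits.size(-1)
    # Hard-label cross-entropy on the gold targets (padding ignored).
    ce = F.cross_entropy(
        student_logits.view(-1, vocab), labels.view(-1), ignore_index=pad_id
    )
    # Soft targets: KL divergence between temperature-scaled distributions,
    # rescaled by T^2 to keep gradient magnitudes comparable.
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * kl + (1.0 - alpha) * ce

In this sketch, alpha trades off imitation of the teacher's output distribution against fitting the gold translations; sequence-level KD variants instead train the student directly on teacher-generated translations.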