BriefGPT.xyz
Oct, 2019
Improving Word Embedding Factorization for Compression via Distilled Non-linear Neural Decomposition
Distilled embedding: non-linear embedding factorization using knowledge distillation
Vasileios Lioutas, Ahmad Rashid, Krtin Kumar, Md Akmal Haidar, Mehdi Rezagholizadeh
TL;DR
This paper introduces an input/output embedding compression method based on low-rank matrix factorization and knowledge distillation. The proposed method is simple to implement, achieves higher BLEU scores and lower language-model perplexity, and is applicable to machine translation and language modeling.
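To build intuition for the compression idea, here is a minimal sketch of plain low-rank factorization of an embedding matrix. Note the paper's method is a *non-linear* factorization trained with knowledge distillation; this linear SVD version and all sizes below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical sizes; the paper's actual vocabulary/dimensions are not given here.
vocab_size, emb_dim, rank = 10000, 512, 64

rng = np.random.default_rng(0)
E = rng.standard_normal((vocab_size, emb_dim))  # stand-in "teacher" embedding matrix

# Low-rank factorization E ≈ A @ B via truncated SVD.
# (The paper distils into a non-linear factorization; this is only
# a linear sketch of how factorization shrinks the parameter count.)
U, s, Vt = np.linalg.svd(E, full_matrices=False)
A = U[:, :rank] * s[:rank]   # shape (vocab_size, rank)
B = Vt[:rank]                # shape (rank, emb_dim)

original_params = E.size
compressed_params = A.size + B.size
ratio = original_params / compressed_params
print(f"compression ratio: {ratio:.1f}x")
```

With these toy sizes the two factors hold about 7.6x fewer parameters than the full embedding table, at the cost of approximation error that the distillation loss in the paper is designed to control.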
Abstract
Word embeddings are a vital component of Natural Language Processing (NLP) systems and have been extensively researched. Better representations of words have come at the cost of huge memory footprints, which has
→