Apr, 2019
Knowledge Distillation For Recurrent Neural Network Language Modeling With Trust Regularization
Yangyang Shi, Mei-Yuh Hwang, Xin Lei, Haoyu Sheng
TL;DR
This paper reduces the computational cost of recurrent neural network (RNN) language models by shrinking the model size with knowledge distillation and trust regularization, while maintaining state-of-the-art perplexity on the Penn Treebank dataset and incurring no increase in word error rate (WER) on a speech recognition task.
Abstract
Recurrent neural networks (RNNs) have dominated language modeling because of their superior performance over traditional N-gram based models. In many applications, a large Recurrent Neural Network language model …
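
The TL;DR above describes compressing an RNN language model with knowledge distillation. As a rough illustration of the standard distillation objective (not the authors' exact loss, and without the paper's trust-regularization term), a student can be trained to match both the gold next word and a teacher's softened output distribution. The sketch below assumes a PyTorch setup; the `temperature` and `alpha` hyperparameters are hypothetical choices, not values from the paper.

```python
# Minimal sketch of a standard knowledge-distillation loss for language
# modeling. This is an illustration only; the paper's trust regularization
# is not reproduced here.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    """Interpolate hard-label cross-entropy with KL divergence to the teacher.

    student_logits, teacher_logits: (num_positions, vocab_size)
    targets: (num_positions,) gold next-word indices
    temperature, alpha: hypothetical hyperparameters.
    """
    # Hard-label loss on the ground-truth next word.
    ce = F.cross_entropy(student_logits, targets)
    # Soft-label loss: match the teacher's temperature-smoothed distribution.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * ce + (1.0 - alpha) * kd


# Toy usage with random logits (vocabulary of 10k words, 32 positions).
student = torch.randn(32, 10000)
teacher = torch.randn(32, 10000)
gold = torch.randint(0, 10000, (32,))
loss = distillation_loss(student, teacher, gold)
```

The `temperature ** 2` factor is the usual scaling that keeps the gradient magnitude of the soft-target term comparable to the hard-label term when the temperature is raised.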