BriefGPT.xyz
Jun, 2018
Towards Binary-Valued Gates for Robust LSTM Training
Zhuohan Li, Di He, Fei Tian, Wei Chen, Tao Qin...
TL;DR
This work proposes a new way of training LSTMs that makes the outputs of the gating units easier to interpret. Empirical studies show that pushing the gate values toward 0 or 1 gives better control over the information flow, improving the model's generalization ability and compression rate.
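The summary describes pushing gate outputs toward 0 or 1 during training. As a rough illustration only (this page does not give the paper's actual estimator), the sketch below sharpens a gate's sigmoid with a temperature and injects training-time noise on the pre-activation, so that only near-binary gate values remain stable; the class and parameter names are hypothetical.

```python
# Minimal sketch, assuming a custom LSTM cell whose gates can be swapped out.
# Not the paper's exact method: it only illustrates one way to drive gate
# activations toward {0, 1}.
import torch
import torch.nn as nn


class SharpenedGate(nn.Module):
    """Gate layer whose outputs are pushed toward binary values.

    `temperature` < 1 sharpens the sigmoid; `noise_scale` adds noise to the
    pre-activation during training so only saturated gates stay stable.
    Both names are illustrative, not taken from the paper.
    """

    def __init__(self, in_features, hidden_size, temperature=0.5, noise_scale=1.0):
        super().__init__()
        self.linear = nn.Linear(in_features + hidden_size, hidden_size)
        self.temperature = temperature
        self.noise_scale = noise_scale

    def forward(self, x, h):
        pre = self.linear(torch.cat([x, h], dim=-1))
        if self.training:
            # Noise on the logit makes outputs near 0.5 unreliable,
            # nudging training toward saturated (near 0 or 1) gates.
            pre = pre + self.noise_scale * torch.randn_like(pre)
        return torch.sigmoid(pre / self.temperature)


# Usage: replace the input/forget gates of a custom LSTM cell with this layer;
# at inference the noise is off and the gate acts as an almost-hard 0/1 mask.
gate = SharpenedGate(in_features=16, hidden_size=32)
x, h = torch.randn(4, 16), torch.randn(4, 32)
print(gate(x, h).shape)  # torch.Size([4, 32])
```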
Abstract
Long short-term memory (LSTM) is one of the most widely used recurrent structures in sequence modeling. It aims to use gates to control information flow in the recurrent computations …