May 2018
Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum
Omer Levy, Kenton Lee, Nicholas FitzGerald, Luke Zettlemoyer
TL;DR
By decoupling the LSTM's gating mechanism, the authors propose a new class of RNNs in which the gating mechanism itself serves as a versatile recurrent model, providing more representational power than previously recognized; experiments show that the gates alone perform no worse than an LSTM in most settings, strongly suggesting that in practice the gating mechanism does far more than just alleviate vanishing gradients.
Abstract
LSTMs were introduced to combat vanishing gradients in simple RNNs by augmenting them with gated additive recurrent connections. We present an alternative view to explain the success of LSTMs: the gates themselves are versatile recurrent models that provide more representational power than previously appreciated. We do this by decoupling the LSTM's gates from the embedded simple RNN, producing a new class of RNNs where the recurrence computes an element-wise weighted sum of context-independent functions of the input. Ablations on a range of problems demonstrate that the gating mechanism alone performs as well as an LSTM in most settings, strongly suggesting that the gates are doing much more in practice than just alleviating vanishing gradients.
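To make the "element-wise weighted sum" reading concrete, below is a minimal NumPy sketch of a decoupled gated recurrence in the spirit of the abstract: the content vector depends only on the current input, while the input and forget gates dynamically weight how much of each content vector is written into, and kept in, the cell state. The sigmoid gates, the linear content function, the omission of the output gate, and all names (`decoupled_gated_rnn`, `Wc`, `Wi`, `Ui`, `Wf`, `Uf`) are illustrative assumptions, not the authors' exact parameterization.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def decoupled_gated_rnn(X, Wc, Wi, Ui, Wf, Uf):
    """Run a gates-only recurrence over an input sequence X of shape (T, d_in).

    The content vector tilde_x_t = Wc @ x_t depends only on the current input
    (no recurrent connection); only the gates see the previous state.
    Returns the cell states, shape (T, d_hid).
    """
    d_hid = Wc.shape[0]
    h = np.zeros(d_hid)   # previous cell state, fed back into the gates only
    c = np.zeros(d_hid)
    cells = []
    for x in X:
        tilde_x = Wc @ x                 # content: context-independent function of x_t
        i = sigmoid(Wi @ x + Ui @ h)     # input gate
        f = sigmoid(Wf @ x + Uf @ h)     # forget gate
        c = f * c + i * tilde_x          # element-wise weighted sum recurrence
        h = c                            # simplified feedback (output gate omitted)
        cells.append(c)
    return np.stack(cells)


# Illustrative usage with random parameters (d_in = 8 inputs, d_hid = 16 hidden units).
rng = np.random.default_rng(0)
d_in, d_hid, T = 8, 16, 5
params = [rng.normal(size=s) for s in [(d_hid, d_in), (d_hid, d_in), (d_hid, d_hid),
                                       (d_hid, d_in), (d_hid, d_hid)]]
states = decoupled_gated_rnn(rng.normal(size=(T, d_in)), *params)
print(states.shape)  # (5, 16)
```

Unrolling the loop makes the title literal: cₜ = Σⱼ (iⱼ ⊙ ∏ₖ₌ⱼ₊₁..ₜ fₖ) ⊙ x̃ⱼ, i.e. each cell state is an element-wise weighted sum of context-independent content vectors, with the weights computed dynamically by the gates.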