May 2018
Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum
Omer Levy, Kenton Lee, Nicholas FitzGerald, Luke Zettlemoyer
TL;DR
By decoupling the LSTM's gating mechanism, the authors propose a new class of RNNs in which the gating mechanism itself serves as a versatile recurrent model, providing more representational power than previously recognized; experiments show that the gates alone perform no worse than an LSTM in most settings, strongly suggesting that in practice the gating mechanism does far more than just alleviate vanishing gradients.
Abstract
LSTMs were introduced to combat vanishing gradients in simple RNNs by augmenting them with gated additive recurrent connections. We present an alternative view to explain the success of LSTMs: the gates themselves are versatile recurrent models that provide more representational power than previously appreciated. We do this by decoupling the LSTM's gates from the embedded simple RNN, producing a new class of RNNs where the recurrence computes an element-wise weighted sum of context-independent functions of the input. Ablations on a range of problems demonstrate that the gating mechanism alone performs as well as an LSTM in most settings, strongly suggesting that the gates are doing much more in practice than just alleviating vanishing gradients.
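To make the "element-wise weighted sum" reading concrete, below is a minimal NumPy sketch of a decoupled gated recurrence in the spirit of the abstract: the content vector depends only on the current input, while the input and forget gates dynamically weight how much of each content vector is written into, and kept in, the cell state. The sigmoid gates, the linear content function, the omission of the output gate, and all names (`decoupled_gated_rnn`, `Wc`, `Wi`, `Ui`, `Wf`, `Uf`) are illustrative assumptions, not the authors' exact parameterization.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def decoupled_gated_rnn(X, Wc, Wi, Ui, Wf, Uf):
    """Run a gates-only recurrence over an input sequence X of shape (T, d_in).

    The content vector tilde_x_t = Wc @ x_t depends only on the current input
    (no recurrent connection); only the gates see the previous state.
    Returns the cell states, shape (T, d_hid).
    """
    d_hid = Wc.shape[0]
    h = np.zeros(d_hid)   # previous cell state, fed back into the gates only
    c = np.zeros(d_hid)
    cells = []
    for x in X:
        tilde_x = Wc @ x                 # content: context-independent function of x_t
        i = sigmoid(Wi @ x + Ui @ h)     # input gate
        f = sigmoid(Wf @ x + Uf @ h)     # forget gate
        c = f * c + i * tilde_x          # element-wise weighted sum recurrence
        h = c                            # simplified feedback (output gate omitted)
        cells.append(c)
    return np.stack(cells)


# Illustrative usage with random parameters (d_in = 8 inputs, d_hid = 16 hidden units).
rng = np.random.default_rng(0)
d_in, d_hid, T = 8, 16, 5
params = [rng.normal(size=s) for s in [(d_hid, d_in), (d_hid, d_in), (d_hid, d_hid),
                                       (d_hid, d_in), (d_hid, d_hid)]]
states = decoupled_gated_rnn(rng.normal(size=(T, d_in)), *params)
print(states.shape)  # (5, 16)
```

Unrolling the loop makes the title literal: cₜ = Σⱼ (iⱼ ⊙ ∏ₖ₌ⱼ₊₁..ₜ fₖ) ⊙ x̃ⱼ, i.e. each cell state is an element-wise weighted sum of context-independent content vectors, with the weights computed dynamically by the gates.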