BriefGPT.xyz
Jan, 2019
Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies
Sarath Chandar, Chinnadhurai Sankar, Eugene Vorontsov, Samira Ebrahimi Kahou, Yoshua Bengio
TL;DR
This paper proposes NRU, a new recurrent neural network architecture that relies on a memory mechanism and uses neither saturating activation functions nor saturating gates, in order to further mitigate the vanishing gradient problem. Across a range of synthetic and real-world tasks, NRU is shown to be the only model among the compared architectures that performs best on all tasks, both with and without long-term dependencies.
Abstract
Modelling long-term dependencies is a challenge for recurrent neural networks. This is primarily due to the fact that gradients vanish during training as the sequence length increases. Gradients can be attenuated …
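The abstract's claim that gradients vanish through saturating activations can be illustrated with a minimal NumPy sketch. This is a toy demo, not the paper's NRU: it backpropagates a gradient through a plain RNN with an orthogonal recurrent matrix (an assumption chosen so that any decay comes from the activation derivative alone), comparing a saturating tanh derivative against a non-saturating identity derivative.

```python
import numpy as np

# Illustrative sketch (NOT the paper's NRU): show how a saturating
# activation (tanh) attenuates gradients backpropagated through a
# simple RNN, compared with a non-saturating (identity) derivative.
rng = np.random.default_rng(0)
T, n = 50, 16

# Orthogonal recurrent weights, so the weight matrix itself neither
# shrinks nor grows gradient norms; any decay must come from the
# activation derivative.
W, _ = np.linalg.qr(rng.standard_normal((n, n)))

def backprop_norm(activation_deriv):
    """Norm of dL/dh_0 after backpropagating through T steps."""
    h = rng.standard_normal(n)
    grad = np.ones(n)
    for _ in range(T):
        pre = W @ h
        # One backward step: dL/dh_{t-1} = W^T (phi'(pre) * dL/dh_t)
        grad = W.T @ (activation_deriv(pre) * grad)
        h = np.tanh(pre)  # keep the forward state bounded
    return np.linalg.norm(grad)

tanh_grad = backprop_norm(lambda x: 1.0 - np.tanh(x) ** 2)  # saturating: phi' < 1
identity_grad = backprop_norm(lambda x: np.ones_like(x))    # non-saturating: phi' = 1

print(tanh_grad, identity_grad)
```

Because tanh's derivative is strictly below 1 away from zero, the gradient norm shrinks multiplicatively at every step, while the identity derivative preserves it exactly through the orthogonal transition; this gap is the motivation the abstract gives for avoiding saturating units.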