Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations

May, 2016

Behnam Neyshabur, Yuhuai Wu, Ruslan Salakhutdinov, Nathan Srebro

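TL;DR: We investigate the parameter-space geometry of recurrent neural networks and develop a path-SGD optimization method, adapted to this geometry, that can learn plain RNNs with ReLU activations. On several datasets that require capturing long-term dependency structure, we show that path-SGD can significantly improve the trainability of ReLU RNNs compared with RNNs trained with SGD, even when the latter use various recently suggested initialization schemes.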

Abstract

We investigate the parameter-space geometry of recurrent neural networks (RNNs), and develop an adaptation of the path-SGD optimization method, attuned to this geometry, that can learn plain RNNs with →