Stochastic Gradient Descent Learns State Equations with Nonlinear Activations

Sep, 2018

Stochastic Gradient Descent Learns State Equations with Nonlinear Activations

Samet Oymak

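TL;DR: This paper studies discrete-time dynamical systems and recurrent neural networks. It proposes a stochastic gradient descent-based method for learning the weight matrices and proves near-optimal sample complexity and linear convergence when the derivative of the activation function is bounded away from zero. Numerical experiments are included to verify the theory.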

Abstract

We study discrete time dynamical systems governed by the state equation $h_{t+1}=\phi(Ah_t+Bu_t)$. Here $A,B$ are weight matrices, $\phi$ is an activation function, and $u_t$ is the input data. This relation is the backbone of →
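As a rough illustration of the state equation $h_{t+1}=\phi(Ah_t+Bu_t)$ and of fitting $A,B$ with stochastic gradient descent, the following is a minimal sketch and not the paper's exact algorithm. It assumes that the states $h_t$ are fully observed, that the inputs $u_t$ are standard Gaussian, and that $\phi$ is a leaky ReLU with slope 0.5 (one activation whose derivative stays away from zero). The function names `simulate` and `sgd_fit`, the ground-truth matrices, the step size, and the trajectory length are all hypothetical choices made for the example.

```python
import numpy as np

SLOPE = 0.5  # assumed leaky-ReLU slope; keeps phi'(x) >= 0.5 > 0


def phi(x):
    """Leaky ReLU activation (an assumed choice of phi)."""
    return np.where(x >= 0, x, SLOPE * x)


def phi_grad(x):
    """Derivative of the leaky ReLU: 1 for x >= 0, SLOPE otherwise."""
    return np.where(x >= 0, 1.0, SLOPE)


def simulate(A, B, T, rng):
    """Roll out h_{t+1} = phi(A h_t + B u_t) from h_0 = 0 with Gaussian inputs."""
    n, p = B.shape
    h = np.zeros((T + 1, n))
    u = rng.standard_normal((T, p))
    for t in range(T):
        h[t + 1] = phi(A @ h[t] + B @ u[t])
    return h, u


def sgd_fit(h, u, lr=0.05, epochs=20, rng=None):
    """SGD on the one-step loss 0.5 * ||phi(A h_t + B u_t) - h_{t+1}||^2."""
    if rng is None:
        rng = np.random.default_rng(0)
    T, p = u.shape
    n = h.shape[1]
    A_hat = np.zeros((n, n))
    B_hat = np.zeros((n, p))
    for _ in range(epochs):
        for t in rng.permutation(T):  # one random sample per update
            z = A_hat @ h[t] + B_hat @ u[t]
            # Chain rule: residual times activation derivative.
            err = (phi(z) - h[t + 1]) * phi_grad(z)
            A_hat -= lr * np.outer(err, h[t])
            B_hat -= lr * np.outer(err, u[t])
    return A_hat, B_hat


# Hypothetical ground truth; the spectral norm of A_true is below 1,
# so the simulated trajectory stays bounded.
A_true = np.array([[0.3, -0.2, 0.0],
                   [0.1, 0.4, -0.1],
                   [0.0, 0.2, 0.3]])
B_true = np.array([[1.0, 0.2, 0.0],
                   [0.0, 1.0, -0.3],
                   [0.1, 0.0, 1.0]])

rng = np.random.default_rng(42)
h, u = simulate(A_true, B_true, T=500, rng=rng)
A_hat, B_hat = sgd_fit(h, u, rng=rng)

print("max |A_hat - A_true|:", np.abs(A_hat - A_true).max())
print("max |B_hat - B_true|:", np.abs(B_hat - B_true).max())
```

Because the simulated data are noise-free, every sample is fit exactly by the true weights, so a constant step size suffices in this sketch. With noisy observations a decaying step size or averaging would be the more usual choice.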