Apr 2019
Implicit Regularization of Discrete Gradient Dynamics in Deep Linear Neural Networks
Gauthier Gidel, Francis Bach, Simon Lacoste-Julien
TL;DR
This paper studies the discrete gradient dynamics of over-parameterized models and shows that, with suitable hyperparameters and initialization, these dynamics learn solutions of reduced-rank regression problems.
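A minimal sketch of this setting (not the authors' code): plain full-batch gradient descent on a depth-2 linear network W2 @ W1, started from a small random initialization and fitted to a synthetic reduced-rank regression target. The dimensions, step size, iteration count, and data generation below are illustrative assumptions; the point is that, with near-zero initialization, the singular values of the learned product emerge roughly one at a time, so the iterates pass near reduced-rank solutions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, true_rank = 10, 200, 2

# Synthetic reduced-rank regression data: targets come from a rank-2 map W_star.
X = rng.standard_normal((n, d))
W_star = rng.standard_normal((d, true_rank)) @ rng.standard_normal((true_rank, d))
Y = X @ W_star.T

# Over-parameterized depth-2 factorization with small (near-zero) initialization.
scale, lr, steps = 1e-3, 5e-3, 20000
W1 = scale * rng.standard_normal((d, d))
W2 = scale * rng.standard_normal((d, d))

for _ in range(steps):
    P = W2 @ W1                 # effective linear map of the network
    R = X @ P.T - Y             # residuals, shape (n, d)
    G = R.T @ X / n             # gradient of the squared loss w.r.t. the product P
    gW1 = W2.T @ G              # chain rule through P = W2 @ W1
    gW2 = G @ W1.T
    W1 -= lr * gW1
    W2 -= lr * gW2

# With a small step size and near-zero initialization, the singular values of
# W2 @ W1 grow sequentially; at the end only about true_rank of them are large,
# so the learned map is close to the reduced-rank regression solution.
print(np.round(np.linalg.svd(W2 @ W1, compute_uv=False), 3))
```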
Abstract
When optimizing over-parameterized models, such as deep neural networks, a large set of parameters can achieve zero training error. In such cases, the choice of the …