BriefGPT.xyz
Apr, 2023
Saddle-to-Saddle Dynamics in Diagonal Linear Networks
Scott Pesme, Nicolas Flammarion
TL;DR
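This paper studies gradient flow in the limit where the initialisation tends to zero, and relates active sets to minimisers of the loss. It shows that the flow jumps from one saddle to the next, each saddle being a minimiser of the loss constrained to an active set, so the dynamics can be read as an incremental learning process. The saddles are computed with a homotopy algorithm similar to the one used for the Lasso path, which resolves the practical difficulty of describing the trajectory.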
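The incremental learning behaviour summarised in the TL;DR can be observed numerically. The following is a minimal sketch, not the authors' code: it assumes the parametrisation beta = u**2 - v**2, a squared loss 0.5 * ||X beta - y||^2, and a plain Euler discretisation of gradient flow. The function name `simulate_diagonal_network`, the toy data, and all default values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_diagonal_network(X, y, alpha=1e-4, lr=1e-3, steps=8000):
    """Euler-discretised gradient flow on a diagonal linear network.

    Assumed (illustrative) setup: beta = u**2 - v**2, loss 0.5*||X beta - y||^2,
    and u = v = alpha at initialisation so that beta starts at exactly zero.
    Returns the trajectory of beta with shape (steps + 1, d).
    """
    d = X.shape[1]
    u = np.full(d, alpha)
    v = np.full(d, alpha)
    path = np.empty((steps + 1, d))
    path[0] = u**2 - v**2
    for k in range(steps):
        beta = u**2 - v**2
        g = X.T @ (X @ beta - y)  # gradient of the loss with respect to beta
        # Chain rule: dL/du = 2*u*g and dL/dv = -2*v*g
        u, v = u - lr * 2.0 * u * g, v + lr * 2.0 * v * g
        path[k + 1] = u**2 - v**2
    return path


# Toy problem with an identity design, so each coordinate evolves independently.
X = np.eye(3)
y = np.array([3.0, 1.0, 0.0])
path = simulate_diagonal_network(X, y)

# Coordinates leave zero one at a time; the largest target is fitted first.
for step in (0, 3000, 8000):
    print(step, np.round(path[step], 3))
```

In this toy run the iterate sits near zero, then fits the first coordinate while the second is still close to zero, and only later fits the second; the third coordinate, whose target is zero, never becomes active. Shrinking `alpha` lengthens the plateaus between these transitions, in line with the vanishing-initialisation regime the paper analyses.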
Abstract
In this paper we fully describe the trajectory of gradient flow over diagonal linear networks in the limit of vanishing initialisation. We show that the limiting flow successively jumps from a saddle of the train
→