May 2023
Implicit bias of SGD in $L_{2}$-regularized linear DNNs: One-way jumps from high to low rank
Zihan Wang, Arthur Jacot
TL;DR
Under SGD, there is a nonzero probability of jumping from a higher-rank minimum to a lower-rank one, but the probability of jumping back is zero; in tasks such as matrix completion, the goal is to converge to the local minimum of minimal rank.
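As a rough intuition for these one-way jumps, the following toy sketch (illustrative only, not the paper's experimental setup; the dimensions, initialization scale, ridge parameter `lam`, step size `lr`, and the rank threshold `1e-3` are all assumptions) runs SGD on an $L_{2}$-regularized three-layer deep linear network fit to the observed entries of a rank-1 matrix, and monitors the numerical rank of the end-to-end matrix during training:

```python
import numpy as np

# Toy sketch: SGD on an L2-regularized 3-layer deep linear network
# W3 @ W2 @ W1, fit to observed entries of a rank-1 target matrix
# (a matrix-completion-style task). We monitor the numerical rank
# of the end-to-end matrix over training.
rng = np.random.default_rng(0)

d, k = 10, 10                                    # ambient dim, hidden width
target = np.outer(rng.normal(size=d), rng.normal(size=d))  # rank-1 target
obs = np.argwhere(rng.random((d, d)) < 0.5)      # indices of observed entries

W1 = rng.normal(size=(k, d)) * 0.3
W2 = rng.normal(size=(k, k)) * 0.3
W3 = rng.normal(size=(d, k)) * 0.3
lam, lr, batch = 1e-2, 0.02, 16                  # ridge, step size, minibatch

for step in range(20001):
    P = W3 @ W2 @ W1
    idx = obs[rng.integers(len(obs), size=batch)]     # minibatch of entries
    R = np.zeros_like(P)                              # masked residual
    R[idx[:, 0], idx[:, 1]] = (P - target)[idx[:, 0], idx[:, 1]]
    # gradients of 0.5*||R||_F^2 + 0.5*lam*sum_i ||W_i||_F^2
    g3 = R @ (W2 @ W1).T + lam * W3
    g2 = W3.T @ R @ W1.T + lam * W2
    g1 = (W3 @ W2).T @ R + lam * W1
    W1 -= lr * g1; W2 -= lr * g2; W3 -= lr * g3
    if step % 5000 == 0:
        sv = np.linalg.svd(W3 @ W2 @ W1, compute_uv=False)
        print(f"step {step:6d}  numerical rank = {(sv > 1e-3).sum()}")
```

The qualitative expectation, per the TL;DR, is that the numerical rank can drop during training (SGD noise plus the ridge term kill small singular values) but, once low, does not climb back up.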
Abstract
The $L_{2}$-regularized loss of deep linear networks (DLNs) with more than one hidden layer has multiple local minima, corresponding to matrices with different ranks. In tasks such as matrix completion, the goal is to converge to the local minimum of minimal rank that still fits the training data.
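For reference, a standard way to write the objective the abstract refers to (the paper's exact normalization of the ridge term may differ; $\ell$, $\lambda$, and $W_i$ are notation assumed here, not taken verbatim from the paper):

$$\mathcal{L}_\lambda(W_1,\dots,W_L) \;=\; \ell\!\left(W_L W_{L-1}\cdots W_1\right) \;+\; \lambda \sum_{i=1}^{L} \lVert W_i \rVert_F^2, \qquad L \ge 3,$$

where $\ell$ is the unregularized loss of the end-to-end matrix $W_L \cdots W_1$ and $L \ge 3$ corresponds to more than one hidden layer. Local minima of $\mathcal{L}_\lambda$ then correspond to end-to-end matrices of different ranks.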