BriefGPT.xyz
Jun, 2017
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks
Levent Sagun, Utku Evci, V. Ugur Guney, Yann Dauphin, Leon Bottou
TL;DR
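We study the properties of common loss surfaces and, in the deep learning setting, split the spectrum of the Hessian matrix into two parts, corroborating the conjecture stated by Sagun et al. Our observations have important implications for high-dimensional non-convex optimization and suggest a new geometric perspective based on the redundancy that over-parametrization introduces.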
Abstract
We study the properties of common loss surfaces through their Hessian matrix. In particular, in the context of deep learning, we empirically show that the spectrum of the Hessian is composed of two parts: (1) the
→
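As a rough illustration of how parameter redundancy shows up in a Hessian spectrum, the sketch below uses a toy over-parametrized linear least-squares model rather than the neural networks studied in the paper. The model, the sizes, the tolerance, and the helper names (`least_squares_hessian`, `count_near_zero`) are all assumptions made for this example. For the loss L(w) = ||Xw - y||^2 / (2n), the Hessian is X^T X / n, whose rank cannot exceed the number of samples n, so with more parameters than samples the remaining eigenvalues are zero.

```python
import numpy as np


def least_squares_hessian(X):
    """Hessian of L(w) = ||Xw - y||^2 / (2n) with respect to w (independent of y)."""
    n = X.shape[0]
    return X.T @ X / n


def count_near_zero(eigenvalues, tol=1e-8):
    """Count eigenvalues whose magnitude is below tol (tol is an assumed threshold)."""
    return int(np.sum(np.abs(eigenvalues) < tol))


# Hypothetical sizes chosen so that parameters outnumber samples.
n_samples, n_params = 20, 100
rng = np.random.default_rng(0)
X = rng.standard_normal((n_samples, n_params))

H = least_squares_hessian(X)
eigenvalues = np.linalg.eigvalsh(H)  # H is symmetric, so eigvalsh applies
n_zero = count_near_zero(eigenvalues)

print(f"{n_zero} of {n_params} eigenvalues are numerically zero")
print(f"largest eigenvalue: {eigenvalues.max():.3f}")
```

In this toy setting the number of numerically zero eigenvalues equals the number of parameters minus the number of samples. Whether and how this carries over to deep networks is the subject of the paper itself, not something this sketch establishes.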