BriefGPT.xyz
Nov, 2023
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
Dayal Singh Kalra, Tianyu He, Maissam Barkeshli
TL;DR
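By analyzing a simplified 2-layer linear network model, we uncover the mechanisms behind sharpness phenomena in gradient descent dynamics, including sharpness reduction, progressive sharpening, and the edge of stability; the model's predictions also hold broadly in practical settings.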
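As a rough illustration of the quantity being discussed, the sketch below tracks sharpness (the top eigenvalue of the loss Hessian) during gradient descent. It is not the paper's model or code: it assumes a hypothetical scalar toy problem with loss 0.5 * (u*v - y)^2, and the names `loss`, `grad`, `hessian`, `sharpness`, and `run_gd` are made up for this example.

```python
import numpy as np

# Assumed toy problem (not taken from the paper): a scalar two-parameter
# product model with loss L(u, v) = 0.5 * (u*v - y)^2.

def loss(u, v, y=1.0):
    return 0.5 * (u * v - y) ** 2

def grad(u, v, y=1.0):
    r = u * v - y  # residual
    return np.array([r * v, r * u])

def hessian(u, v, y=1.0):
    r = u * v - y
    off_diag = r + u * v  # d^2 L / (du dv) = 2*u*v - y
    return np.array([[v * v, off_diag],
                     [off_diag, u * u]])

def sharpness(u, v, y=1.0):
    # eigvalsh returns eigenvalues in ascending order; the last is the largest.
    return float(np.linalg.eigvalsh(hessian(u, v, y))[-1])

def run_gd(u, v, lr, steps, y=1.0):
    """Run plain gradient descent and record (loss, sharpness) before each step."""
    history = []
    for _ in range(steps):
        history.append((loss(u, v, y), sharpness(u, v, y)))
        g = grad(u, v, y)
        u, v = u - lr * g[0], v - lr * g[1]
    return history

if __name__ == "__main__":
    lr = 0.1
    history = run_gd(0.5, 0.5, lr=lr, steps=200)
    final_loss, final_sharpness = history[-1]
    print(f"final loss: {final_loss:.3e}")
    print(f"final sharpness: {final_sharpness:.4f} (2/lr = {2 / lr:.1f})")
```

Comparing the recorded sharpness against 2/lr is the usual reference point, since gradient descent on a quadratic is stable only while the sharpness stays below that value.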
Abstract
In gradient descent dynamics of neural networks, the top eigenvalue of the Hessian of the loss (sharpness) displays a variety of robust ph
→