BriefGPT.xyz
Oct, 2022
Symmetries, flat minima, and the conserved quantities of gradient flow
Bo Zhao, Iordan Ganev, Robin Walters, Rose Yu, Nima Dehmamy
TL;DR
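Using the equivariance of activation functions, and generalizing the approach to nonlinear neural networks, the authors identify low-loss valleys of global minima. The method can improve robustness and offers insight into the effect of initialization.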
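As a rough illustration of how an activation function's equivariance can give rise to a parameter-space symmetry, the sketch below uses a standard textbook case: ReLU commutes with positive rescaling, so scaling each hidden unit's incoming weights by a positive factor and its outgoing weights by the inverse leaves the network output unchanged. The two-layer network, the helper names (`forward`, `rescale`), and the choice of ReLU are assumptions made for illustration only; the summary does not say which symmetries the paper constructs.

```python
import random

def relu(v):
    """Elementwise ReLU on a list of floats."""
    return [max(z, 0.0) for z in v]

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * z for w, z in zip(row, v)) for row in W]

def forward(W1, W2, x):
    """Two-layer network without biases: W2 @ relu(W1 @ x)."""
    return matvec(W2, relu(matvec(W1, x)))

def rescale(W1, W2, d):
    """Hypothetical helper: W1 -> D W1, W2 -> W2 D^{-1} for a positive diagonal D.

    Because relu(s * z) == s * relu(z) whenever s > 0, this transformation
    changes the parameters but not the function the network computes.
    """
    if any(s <= 0 for s in d):
        raise ValueError("all scaling factors must be positive")
    W1_new = [[s * w for w in row] for s, row in zip(d, W1)]   # scale row i by d[i]
    W2_new = [[w / s for w, s in zip(row, d)] for row in W2]   # scale column i by 1/d[i]
    return W1_new, W2_new

random.seed(0)
W1 = [[random.gauss(0, 1) for _ in range(3)] for _ in range(4)]  # 4 hidden units, 3 inputs
W2 = [[random.gauss(0, 1) for _ in range(4)] for _ in range(2)]  # 2 outputs
x = [random.gauss(0, 1) for _ in range(3)]
d = [random.uniform(0.5, 2.0) for _ in range(4)]                 # positive scale factors

W1_new, W2_new = rescale(W1, W2, d)
out_a = forward(W1, W2, x)
out_b = forward(W1_new, W2_new, x)
max_diff = max(abs(a - b) for a, b in zip(out_a, out_b))
print("max output difference:", max_diff)
```

Every choice of positive factors `d` gives a different parameter setting with the same output on every input, and therefore the same loss. Varying `d` continuously traces out a connected set of equal-loss parameters, which is the kind of flat direction the summary refers to.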
Abstract
Empirical studies of the loss landscape of deep networks have revealed that many local minima are connected through low-loss valleys. Ense