BriefGPT.xyz
Jun, 2022
The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the \emph{Grokking Phenomenon}
Vimal Thilak, Etai Littwin, Shuangfei Zhai, Omid Saremi, Roni Paiss...
TL;DR
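This paper aims to uncover the principles underlying the grokking phenomenon through a series of empirical studies. It identifies an optimization anomaly of adaptive optimizers, termed the Slingshot Mechanism, which is a prominent manifestation of grokking.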
Abstract
The \emph{grokking phenomenon} as reported by Power et al.~\cite{power2021grokking} refers to a regime where a long period of overfitting is followed by a seemingly sudden transition to perfect...