BriefGPT.xyz
May, 2024
Progress Measures for Grokking on Real-world Datasets
Satvik Golechha
TL;DR
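This study examines grokking and the factors associated with it on real-world datasets, in the setting of classification with deep neural networks. It finds that weight norm is not the main cause of grokking, and that the proposed progress measures give a better understanding of grokking dynamics.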
Abstract
Grokking, a phenomenon where machine learning models generalize long after overfitting, has been primarily observed and studied in algorithmic tasks. This paper explores