BriefGPT.xyz
May, 2021
On the geometry of generalization and memorization in deep neural networks
Cory Stephenson, Suchismita Padhy, Abhinav Ganesh, Yue Hui, Hanlin Tang...
TL;DR
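A geometric analysis of the structure of memorization in deep neural networks and its associated features finds that memorization is more pronounced in the deeper layers, that it can be prevented by restoring layer weights, and that it is related to both the model's geometric structure and its generalization performance.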
Abstract
Understanding how large neural networks avoid memorizing training data is key to explaining their high generalization performance. To examine the structure of when and where