BriefGPT.xyz
Mar, 2020
Weak and Strong Gradient Directions: Explaining Memorization, Generalization, and Hardness at Scale
Explaining Memorization and Generalization: A Large-Scale Study with Coherent Gradients
Piotr Zielinski, Shankar Krishnan, Satrajit Chatterjee
TL;DR
This paper validates the Coherent Gradients hypothesis through experiments on models such as ResNet, Inception, and VGG, and proposes a scalable method for suppressing weak gradient directions; this is presented as the first compelling evidence for such an explanation of generalization at the scale of contemporary supervised learning.
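The TL;DR mentions a scalable way to suppress weak gradient directions but does not spell it out. As a loose illustration only (an assumption made for this summary, not necessarily the authors' exact algorithm), one robust-aggregation idea is to replace the mini-batch mean gradient with a coordinate-wise median over several micro-batch gradients, so that directions supported by only a few examples are damped. The function and argument names below (`median_gradient_step`, `micro_batches`) are hypothetical.

```python
import torch

def median_gradient_step(model, loss_fn, micro_batches, lr=0.1):
    """One SGD-like step using the elementwise median of micro-batch gradients.

    `micro_batches` is an iterable of (inputs, targets) pairs; this interface
    is assumed for the sketch and is not taken from the paper.
    """
    per_param_grads = [[] for _ in model.parameters()]
    for inputs, targets in micro_batches:
        model.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        for store, p in zip(per_param_grads, model.parameters()):
            g = p.grad if p.grad is not None else torch.zeros_like(p)
            store.append(g.detach().clone())
    with torch.no_grad():
        for store, p in zip(per_param_grads, model.parameters()):
            # Coordinate-wise median across micro-batches keeps directions that
            # most micro-batches agree on and damps idiosyncratic (weak) ones.
            robust_grad = torch.stack(store).median(dim=0).values
            p.add_(robust_grad, alpha=-lr)
```

The median is just one possible robust aggregate; the point of the sketch is that strong directions, reinforced by many examples, survive aggregation, while weak directions driven by a handful of examples are suppressed.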
Abstract
Coherent Gradients is a recently proposed hypothesis to explain why over-parameterized neural networks trained with gradient descent …
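To make the term concrete, the sketch below probes how "coherent" per-example gradients are by computing their mean pairwise cosine similarity over a small batch. This is an illustrative diagnostic assumed for this summary, not a procedure taken from the paper; the helper name `gradient_coherence` is made up.

```python
import torch

def gradient_coherence(model, loss_fn, inputs, targets):
    """Mean pairwise cosine similarity of per-example gradients (assumed probe)."""
    grads = []
    for x, y in zip(inputs, targets):
        model.zero_grad()
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        loss.backward()
        flat = torch.cat([
            (p.grad if p.grad is not None else torch.zeros_like(p)).detach().reshape(-1)
            for p in model.parameters()
        ])
        grads.append(flat)
    g = torch.stack(grads)                        # (n_examples, n_params)
    g = torch.nn.functional.normalize(g, dim=1)   # unit-norm per example
    sims = g @ g.T                                # pairwise cosine similarities
    n = sims.shape[0]
    off_diag = sims.sum() - n                     # drop the self-similarity terms
    return off_diag / (n * (n - 1))               # average over ordered pairs
```

Values near 1 would indicate that the examples pull the parameters in largely the same direction, the regime the hypothesis associates with generalizable learning rather than memorization.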