BriefGPT.xyz
Dec, 2017
Theory of Deep Learning III: explaining the non-overfitting puzzle
Tomaso Poggio, Kenji Kawaguchi, Qianli Liao, Brando Miranda, Lorenzo Rosasco...
TL;DR
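This study examines overfitting in deep networks and finds that the optimization dynamics of gradient descent in nonlinear networks are equivalent to those of a linear system. It also extends two properties of gradient descent to nonlinear networks: implicit regularization and asymptotic convergence to the minimum-norm solution. These properties help explain the model's ability to generalize and also lead to good classification error on classification tasks.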
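The minimum-norm property mentioned above is easiest to see in the linear case that the paper uses as its reference point. The following is a minimal sketch, not the authors' code: it assumes an overparametrized least-squares problem with square loss, zero initialization, and a fixed step size, and the names `gd_least_squares`, `w_gd`, and `w_min_norm` are hypothetical. It only illustrates the linear setting, not the extension to nonlinear networks.

```python
import numpy as np

def gd_least_squares(X, y, lr, steps):
    """Plain gradient descent on 0.5 * ||X w - y||^2, started from w = 0."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y)  # gradient of the square loss
    return w

rng = np.random.default_rng(0)
n_samples, n_params = 5, 20  # more parameters than data points (assumed sizes)
X = rng.standard_normal((n_samples, n_params))
y = rng.standard_normal(n_samples)  # arbitrary targets, playing the role of random labels

# Step size 1 / lambda_max(X^T X), inside the stable range lr < 2 / lambda_max.
lr = 1.0 / np.linalg.norm(X, 2) ** 2
w_gd = gd_least_squares(X, y, lr, steps=20000)

# Minimum-norm interpolating solution given by the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

train_residual = np.linalg.norm(X @ w_gd - y)
gap_to_min_norm = np.linalg.norm(w_gd - w_min_norm)
print("training residual:", train_residual)
print("distance to minimum-norm solution:", gap_to_min_norm)
```

Because the iterates start at zero and every update lies in the row space of `X`, gradient descent can only reach the interpolating solution with the smallest norm, which is why both printed quantities should be close to zero even though the training data are fit exactly.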
Abstract
A main puzzle of deep networks revolves around the absence of overfitting despite overparametrization and despite the large capacity demonstrated by zero training error on randomly labeled data. In this note, we …