BriefGPT.xyz
Jul, 2020
Early Stopping in Deep Networks: Double Descent and How to Eliminate it
Reinhard Heckel, Fatih Furkan Yilmaz
TL;DR
This paper studies how the test error of over-parameterized models, in particular deep neural networks, evolves over the course of training. The epoch-wise double descent arises because different parts of the network are learned at different epochs, yielding a superposition of nested bias-variance tradeoffs. By appropriately scaling the step sizes, the double descent can be eliminated and early-stopping performance significantly improved.
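A minimal sketch of the early-stopping procedure the summary refers to: train with gradient descent, track validation error, and keep the weights with the lowest validation error seen so far. The toy regression problem, the single global step size, and all variable names are illustrative assumptions, not the paper's actual experimental setup (the paper's proposal concerns tuning step sizes per layer).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy regression data standing in for the deep-network setting.
X = rng.normal(size=(200, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.5 * rng.normal(size=200)
X_tr, y_tr, X_va, y_va = X[:100], y[:100], X[100:], y[100:]

def val_mse(w):
    """Validation mean squared error of a linear model with weights w."""
    return float(np.mean((X_va @ w - y_va) ** 2))

# Gradient descent with early stopping: remember the weights that achieve
# the lowest validation error over all epochs.
step = 0.01  # global step size; the paper instead scales step sizes per layer
w = np.zeros(10)
best_w, best_err, best_epoch = w.copy(), val_mse(w), 0
for epoch in range(1, 201):
    grad = X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)
    w -= step * grad
    err = val_mse(w)
    if err < best_err:
        best_w, best_err, best_epoch = w.copy(), err, epoch

print(best_epoch, round(best_err, 3))
```

The early-stopped weights `best_w` are what would be deployed; with ill-chosen step sizes the validation curve can exhibit the double descent discussed in the abstract, making the stopping point harder to pick.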
Abstract
Over-parameterized models, in particular deep networks, often exhibit a double descent phenomenon, where, as a function of model size, error first decreases, then increases, and finally decreases again. This intriguing double […]