BriefGPT.xyz
Jun, 2023
SGLD-Based Information Criteria and the Over-Parameterized Regime
Haobo Chen, Yuheng Bu, Gregory W. Wornell
TL;DR
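By revisiting the analysis behind classical information criteria within the information risk minimization framework, the paper derives versions of the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) for models learned by stochastic gradient Langevin dynamics (SGLD), and extends the information-theoretic analysis to over-parameterized models.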
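For orientation, the textbook forms of the quantities named above can be written as follows. This is a reference sketch only, not the SGLD-based criteria derived in the paper. The symbols are assumptions introduced here: k is the number of parameters, n the number of samples, \hat{L} the maximized likelihood, \eta_t the step size, \beta an inverse temperature, and \hat{\ell} a mini-batch estimate of the empirical loss.

```latex
% Classical information criteria (textbook forms, not the paper's results)
\mathrm{AIC} = 2k - 2\ln \hat{L},
\qquad
\mathrm{BIC} = k \ln n - 2\ln \hat{L}.

% A standard form of the SGLD update: a noisy gradient step with
% isotropic Gaussian noise scaled by the step size and inverse temperature
\theta_{t+1} = \theta_t - \eta_t \nabla \hat{\ell}(\theta_t)
  + \sqrt{\frac{2\eta_t}{\beta}}\,\xi_t,
\qquad \xi_t \sim \mathcal{N}(0, I).
```

In both classical criteria the penalty grows with k, which is why, in this form, they do not anticipate a drop in test loss once the parameter count passes the interpolation threshold.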
Abstract
Double-descent refers to the unexpected drop in test loss of a learning algorithm beyond an interpolating threshold with over-parameterization, which is not predicted by information criteria in their classical fo…