BriefGPT.xyz
May, 2023
ADLER -- An efficient Hessian-based strategy for adaptive learning rate
Dario Balboni, Davide Bacciu
TL;DR
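This work proposes an adaptive SGD learning-rate strategy for deep models based on a local quadratic approximation, and compares it with grid-searched SGD learning rates and with the Gauss-Newton approximation. The strategy relies on a fairly accurate positive semi-definite estimate of the Hessian, and its performance is evaluated on classification tasks using convolutional neural networks of different architectures (with and without residual connections).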
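As a generic illustration of what a step size derived from a local quadratic approximation can look like (a textbook sketch, not necessarily the exact rule used by the authors): let g be the minibatch gradient at parameters θ, B any positive semi-definite curvature approximation, and η the learning rate. These symbols are assumptions introduced here. Minimizing the quadratic model along the direction -g gives a closed-form η:

```latex
m(\eta) = L(\theta) - \eta\, g^\top g + \tfrac{1}{2}\,\eta^{2}\, g^\top B\, g,
\qquad
\frac{dm}{d\eta} = -\,g^\top g + \eta\, g^\top B\, g = 0
\;\Longrightarrow\;
\eta^{*} = \frac{g^\top g}{g^\top B\, g}
\quad \text{(valid when } g^\top B\, g > 0\text{)}.
```

Positive semi-definiteness of B guarantees that the denominator is never negative, which is one reason such an approximation is convenient for choosing a step size.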
Abstract
We derive a sound positive semi-definite approximation of the Hessian of deep models for which Hessian-vector products are easily computable…
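To make the idea of a Hessian-vector product concrete, the following minimal sketch computes one without ever forming the Hessian, then uses it to pick a step size from the local quadratic model. It is only an illustration under stated assumptions: the toy loss, the helper names (`grad`, `hvp`, `quadratic_step_size`, `sgd_adaptive`), and the central finite-difference scheme are all hypothetical choices made here, and the sketch uses the exact Hessian of a convex toy problem rather than the paper's positive semi-definite approximation.

```python
def grad(x):
    # Gradient of the toy loss f(x) = 0.5 * (x0^2 + 10 * x1^2) (assumed example).
    return [x[0], 10.0 * x[1]]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def hvp(grad_fn, x, v, eps=1e-4):
    # Central finite-difference Hessian-vector product:
    #   H v ~= (grad(x + eps*v) - grad(x - eps*v)) / (2*eps)
    # Only two gradient evaluations are needed; H is never built.
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus = grad_fn(x_plus)
    g_minus = grad_fn(x_minus)
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]


def quadratic_step_size(grad_fn, x, fallback=0.1):
    # Step size minimizing the local quadratic model along -g:
    #   eta = (g . g) / (g . H g)
    g = grad_fn(x)
    curvature = dot(g, hvp(grad_fn, x, g))
    if curvature <= 0.0:
        # Non-positive curvature: the quadratic model has no minimum
        # along -g, so fall back to a fixed step (assumed policy).
        return fallback
    return dot(g, g) / curvature


def sgd_adaptive(grad_fn, x, steps=50):
    # Plain gradient descent with the adaptive step size above.
    for _ in range(steps):
        g = grad_fn(x)
        if dot(g, g) == 0.0:
            break
        eta = quadratic_step_size(grad_fn, x)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    print(quadratic_step_size(grad, [1.0, 1.0]))
    print(sgd_adaptive(grad, [1.0, 1.0]))
```

In a deep-learning framework the finite-difference step would normally be replaced by an automatic-differentiation Hessian-vector product, and the curvature term by whichever positive semi-definite approximation is being used.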