BriefGPT.xyz
May, 2019
Limitations of the Empirical Fisher Approximation for Natural Gradient Descent
Frederik Kunstner, Lukas Balles, Philipp Hennig
TL;DR
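This paper disputes the claimed relationship between approximate second-order methods and heuristic algorithms such as Adam. It points out that the empirical Fisher does not generally capture second-order information the way the Fisher does, and that the pathologies of the empirical Fisher can have undesirable effects even in simple optimization problems.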
Abstract
Natural gradient descent, which preconditions a gradient descent update with the Fisher information matrix of the underlying statistical model, is a way to capture partial second-order information. Several highly
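To make the preconditioning described in the abstract concrete, here is a minimal NumPy sketch, not the authors' code. It assumes a linear regression model with unit-variance Gaussian noise, for which the Fisher information matrix is the sum of x_n x_n^T and coincides with the Hessian of the squared loss. The empirical Fisher is taken to be the sum of outer products of per-example gradients. All function names and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np


def gradient(X, y, theta):
    """Gradient of 0.5 * sum_n (x_n^T theta - y_n)^2 with respect to theta."""
    return X.T @ (X @ theta - y)


def fisher(X):
    """Fisher information for y ~ N(x^T theta, 1): sum_n x_n x_n^T.

    Assumption: unit noise variance, so the Fisher does not depend on y or theta.
    """
    return X.T @ X


def empirical_fisher(X, y, theta):
    """Empirical Fisher: sum_n g_n g_n^T, where g_n is the per-example gradient."""
    residuals = X @ theta - y                     # shape (N,)
    per_example_grads = residuals[:, None] * X    # shape (N, D)
    return per_example_grads.T @ per_example_grads


def preconditioned_step(theta, grad, precond, step_size=1.0):
    """One update theta - step_size * precond^{-1} grad."""
    return theta - step_size * np.linalg.solve(precond, grad)


# Toy data (hypothetical): 50 examples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + 0.1 * rng.normal(size=50)

theta0 = np.zeros(3)
g = gradient(X, y, theta0)

# Preconditioning with the Fisher versus the empirical Fisher.
theta_ngd = preconditioned_step(theta0, g, fisher(X))
theta_ef = preconditioned_step(theta0, g, empirical_fisher(X, y, theta0))

# Reference: the ordinary least-squares solution.
theta_ls = np.linalg.lstsq(X, y, rcond=None)[0]

print("Fisher step:          ", theta_ngd)
print("empirical Fisher step:", theta_ef)
print("least squares:        ", theta_ls)
```

In this setting a single full step preconditioned with the Fisher lands on the least-squares solution, because the Fisher equals the Hessian of the quadratic loss. The empirical Fisher weights each x_n x_n^T by the squared residual, so the same step generally ends up somewhere else, which is the kind of discrepancy the summary above refers to.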