Amitay Bar, Rotem Mulayoff, Tomer Michaeli, Ronen Talmon
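TL;DR: The expected loss of preconditioned Langevin dynamics near stationary points of the objective function is proportional to the rank of the objective's Hessian; the expected losses of SGD-like and Adam-like preconditioners are also compared in neural network applications.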
Abstract
Langevin dynamics (LD) is widely used for sampling from distributions and for optimization. In this work, we derive a closed-form expression for the expected loss of →