BriefGPT.xyz
Jun, 2024
An Improved Empirical Fisher Approximation for Natural Gradient Descent
Xiaodong Wu, Wenyi Yu, Chao Zhang, Philip Woodland
TL;DR
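The paper proposes an improved empirical Fisher (iEF) method to address the inverse scaling problem of the empirical Fisher information matrix in approximate natural gradient descent (NGD) methods. It evaluates the method experimentally and reports good convergence and fitting ability in settings such as parameter-efficient fine-tuning and deep learning optimisation.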
Abstract
Approximate natural gradient descent (NGD) methods are an important family of optimisers for deep learning models, which use approximate Fisher information matrices to pre-condition gradients during training. The empiri
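To make the idea of pre-conditioning gradients with an approximate Fisher matrix concrete, here is a minimal NumPy sketch of one step using the standard empirical Fisher (the average outer product of per-sample gradients). It is not the paper's iEF method. The logistic-regression model, the toy data, the `damping` term, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def per_sample_grads(theta, X, y):
    """Per-sample gradients of the logistic loss, shape (N, D)."""
    p = 1.0 / (1.0 + np.exp(-X @ theta))  # predicted probabilities
    return (p - y)[:, None] * X

def mean_loss(theta, X, y):
    """Mean logistic loss, written in a numerically stable form."""
    z = X @ theta
    return np.mean(np.logaddexp(0.0, z) - y * z)

def ef_preconditioned_step(theta, X, y, lr=0.1, damping=1e-3):
    """One update theta - lr * (F + damping * I)^{-1} g.

    F is the standard empirical Fisher; damping is an assumed
    regulariser that keeps the linear system well conditioned.
    """
    G = per_sample_grads(theta, X, y)
    g = G.mean(axis=0)                    # mean gradient
    F = G.T @ G / G.shape[0]              # empirical Fisher, shape (D, D)
    F_damped = F + damping * np.eye(theta.size)
    return theta - lr * np.linalg.solve(F_damped, g)

# Toy data (hypothetical): labels come from a fixed linear rule.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)

theta0 = np.zeros(3)
theta1 = ef_preconditioned_step(theta0, X, y)
print(mean_loss(theta0, X, y), mean_loss(theta1, X, y))
```

Forming and solving against the full D-by-D matrix is only practical for small models, which is why practical optimisers rely on cheaper structured approximations.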