Revisiting Natural Gradient for Deep Networks

Jan, 2013

Natural Gradient Revisited

Razvan Pascanu, Yoshua Bengio

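TL;DR: This paper studies the use of the natural gradient algorithm in deep learning and its connections to three other methods. It proposes a new way of using unlabeled data to make the generalization error of natural gradient more robust, and extends natural gradient to incorporate second-order information and manifold information.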

Abstract

The aim of this paper is twofold. First, we intend to show that Hessian-free optimization (Martens, 2010) and Krylov subspace descent (Vinyals and Povey, 2012) can be described as implementations of