Hessian-free (HF) optimization has been successfully used for training deep
autoencoders and recurrent networks. HF uses the conjugate gradient algorithm
to construct update directions from curvature-vector products, each of which
can be computed at a cost on the same order as a gradient evaluation. In thi