The Energy Landscape of Deep Networks

Nov, 2015

Trivializing The Energy Landscape Of Deep Networks

Pratik Chaudhari, Stefano Soatto

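TL;DR: This paper proposes AnnealSGD, a regularized stochastic gradient descent algorithm for optimizing the loss function, motivated by an analysis of the energy landscape of a particular class of deep networks.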

Abstract

We study a theoretical model that connects deep learning to finding the ground state of the Hamiltonian of a spherical spin glass. Existing results motivated from statistical physics show that deep networks have a highly non-convex energy landscape with exponentially many local minima and energy barriers beyond which gradient descent algorithms cannot make p