Multi-layer neural networks are among the most powerful models in machine
learning, yet the fundamental reasons for this success defy mathematical
understanding. Learning a neural network requires to optimize a non-convex
high-dimensional objective (risk function), a problem which is u