We consider the problem of training a multi-layer over-parametrized neural
network to minimize the empirical risk induced by a loss function. In the
typical setting of over-parametrization, the network width $m$ is much larger
than the data dimension $d$ and the number of training samples $n$.
This paper studies the effectiveness of over-parametrization in neural network learning. It proposes a method that uses local search algorithms to find a globally optimal solution, and uses Rademacher complexity theory to prove that, under weight decay, the solution also generalizes well when the data is sampled from a regular distribution such as the Gaussian. It further analyzes essential properties of a shallow network with quadratic activation, $k$ hidden nodes, and $n$ training data points.
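To make the objective concrete, the following is a minimal sketch (not the paper's implementation) of the empirical risk for a shallow network with quadratic activation, $k$ hidden nodes, and a weight-decay penalty; all dimensions, the squared loss, and the function name `empirical_risk` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n training points in d dimensions, k hidden nodes.
n, d, k = 8, 3, 4

X = rng.standard_normal((n, d))   # data sampled from a Gaussian distribution
y = rng.standard_normal(n)        # illustrative regression targets

W = rng.standard_normal((k, d))   # hidden-layer weights, one row per node

def empirical_risk(W, X, y, wd=0.0):
    """Squared-loss empirical risk of a shallow quadratic-activation
    network f(x) = sum_j (w_j^T x)^2, plus an optional weight-decay
    (squared Frobenius norm) penalty with coefficient `wd`."""
    preds = ((X @ W.T) ** 2).sum(axis=1)   # quadratic activation, summed over nodes
    risk = np.mean((preds - y) ** 2)
    return risk + wd * np.sum(W ** 2)

print(empirical_risk(W, X, y, wd=1e-3))
```

A local search method such as gradient descent would then be run on this objective; the weight-decay term is what the generalization argument via Rademacher complexity relies on.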