Nov 2018
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu
TL;DR
Studies the training of deep neural networks with ReLU activation functions using gradient descent and stochastic gradient descent, proving that under certain conditions, proper random weight initialization allows these methods to reach the global minimum on over-parameterized deep ReLU networks.
Abstract
We study the problem of training deep neural networks with Rectified Linear Unit (ReLU) activation function using gradient descent and stochastic gradient descent (SGD).
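
To make the setting concrete, here is a minimal sketch (not the paper's code) of the regime studied: a deep, over-parameterized fully connected ReLU network trained from Gaussian random initialization with plain SGD. The width, depth, learning rate, loss, and synthetic data below are illustrative assumptions, written in PyTorch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy binary classification data: n points in d dimensions
# (hypothetical stand-in for the training data in the paper's setting).
n, d, width, depth = 64, 10, 1024, 3
X = torch.randn(n, d)
y = torch.randint(0, 2, (n,)).float()

# Over-parameterized fully connected ReLU network:
# hidden width chosen much larger than n.
layers = [nn.Linear(d, width), nn.ReLU()]
for _ in range(depth - 1):
    layers += [nn.Linear(width, width), nn.ReLU()]
layers += [nn.Linear(width, 1)]
net = nn.Sequential(*layers)

# Gaussian random weight initialization (He-style scaling is an
# illustrative choice, not necessarily the paper's exact scheme).
for m in net.modules():
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=(2.0 / m.weight.shape[1]) ** 0.5)
        nn.init.zeros_(m.bias)

# Plain SGD on a logistic (cross-entropy) loss.
opt = torch.optim.SGD(net.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(500):
    idx = torch.randint(0, n, (16,))  # random minibatch -> stochastic gradients
    opt.zero_grad()
    loss = loss_fn(net(X[idx]).squeeze(-1), y[idx])
    loss.backward()
    opt.step()
    if step % 100 == 0:
        print(f"step {step:3d}  minibatch loss {loss.item():.4f}")
```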