Sep 2019
Student Specialization in Deep ReLU Networks with Finite Width and Input Dimension
Over-parameterization as a Catalyst for Better Generalization of Deep ReLU network
Yuandong Tian
TL;DR
This paper studies ReLU / Leaky ReLU networks trained by gradient descent and shows how nodes specialize in two-layer and multi-layer networks. It proves that, under suitable conditions on the dataset and on the pair of networks, the results accommodate certain forms of data augmentation and hold for sample sets of fixed size, and it characterizes the minimal divergence between neuron nodes, the smallest gradient magnitude required, and the inductive bias that emerges during training.
Abstract
To analyze deep ReLU networks, we adopt a student-teacher setting in which an over-parameterized student network learns from the output of a fixed teacher network of the same depth, with stochastic gradient descent (SGD).
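
As a concrete illustration of this setting, the sketch below trains an over-parameterized student ReLU network with SGD to match the outputs of a frozen teacher of the same depth. It is a minimal PyTorch sketch, not code from the paper: the Gaussian input distribution, layer widths, learning rate, and step count are illustrative assumptions.

import torch
import torch.nn as nn


def mlp(widths):
    # Deep ReLU network: Linear layers with ReLU between them,
    # no ReLU after the final (output) layer.
    layers = []
    for d_in, d_out in zip(widths[:-1], widths[1:]):
        layers += [nn.Linear(d_in, d_out), nn.ReLU()]
    return nn.Sequential(*layers[:-1])


torch.manual_seed(0)

# Teacher and student share the same depth; the student is
# over-parameterized (wider hidden layers). The teacher stays fixed.
teacher = mlp([20, 10, 10, 1])   # assumed widths, for illustration only
student = mlp([20, 50, 50, 1])
for p in teacher.parameters():
    p.requires_grad_(False)

opt = torch.optim.SGD(student.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(5000):
    x = torch.randn(64, 20)                 # fresh mini-batch of inputs
    loss = loss_fn(student(x), teacher(x))  # regress onto the teacher's output
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        print(f"step {step}: loss {loss.item():.6f}")

After training, the paper's question can be probed empirically, e.g. by comparing each student neuron's first-layer weight vector against the teacher's and checking which teacher node it aligns with.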