BriefGPT.xyz
Oct, 2024
Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse
Arthur Jacot, Peter Súkeník, Zihan Wang, Marco Mondelli
TL;DR
This work tackles the problem of theoretically proving neural collapse in deep neural networks at convergence, by studying networks that end in at least two linear layers instead of the previously used unconstrained features model. The key result is that, under gradient descent training with weight decay, neural collapse is provably guaranteed to emerge generically, offering a new perspective on understanding neural network training.
Abstract
Deep Neural Networks (DNNs) at convergence consistently represent the training data in the last layer via a highly symmetric geometric structure referred to as Neural Collapse. This empirical evidence has spurred …
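To make the "highly symmetric geometric structure" concrete, the sketch below illustrates one standard neural-collapse criterion (often called NC1): the within-class scatter of last-layer features vanishes relative to the between-class scatter. The features here are constructed by hand to exhibit collapse; this is a minimal illustration of the geometry the abstract describes, not the paper's training setup, and the helper `nc1` is an assumed name for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d, n = 4, 16, 50  # number of classes, feature dimension, samples per class

# Class means: random orthonormal directions, then centered (an approximate
# simplex-like configuration; hand-built for illustration only).
M = np.linalg.qr(rng.normal(size=(d, K)))[0]          # orthonormal columns, shape (d, K)
means = M - M.mean(axis=1, keepdims=True)             # centered class means

def nc1(features, labels, num_classes):
    """Trace ratio of within-class to between-class scatter; -> 0 under collapse."""
    mu_global = features.mean(axis=0)
    sw = sb = 0.0
    for k in range(num_classes):
        fk = features[labels == k]
        mu_k = fk.mean(axis=0)
        sw += ((fk - mu_k) ** 2).sum()                # within-class scatter
        sb += len(fk) * ((mu_k - mu_global) ** 2).sum()  # between-class scatter
    return sw / sb

labels = np.repeat(np.arange(K), n)
# "Collapsed" features: each sample sits at its class mean plus tiny noise.
feats = means.T[labels] + 1e-4 * rng.normal(size=(K * n, d))
print(f"NC1 ratio (near 0 indicates collapse): {nc1(feats, labels, K):.2e}")
```

Running this prints an NC1 ratio many orders of magnitude below 1, whereas features spread widely around their class means would give a ratio of order 1 or larger.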