BriefGPT.xyz
May, 2023
On the impact of activation and normalization in obtaining isometric embeddings at initialization
Amir Joudaki, Hadi Daneshmand, Francis Bach
TL;DR
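This paper studies the structure of the Gram matrix in deep neural networks. For multilayer perceptrons, it proves that layer normalization combined with activation layers drives the Gram matrix toward isometry, and it further clarifies the importance of higher-order Hermite coefficients in this effect.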
Abstract
In this paper, we explore the structure of the penultimate Gram matrix in deep neural networks, which contains the pairwise inner products of outputs corresponding to a batch of inputs. In several architectures i…
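To make the definition concrete, here is a minimal sketch of a Gram matrix for a small batch, computed after a parameter-free layer normalization. The function names (`layer_norm`, `gram_matrix`, `max_abs_cosine`) and the toy batch are hypothetical, chosen only for illustration. The sketch applies no weights or activations, and `max_abs_cosine` is just one simple proxy for how close the outputs are to orthogonal, not necessarily the isometry measure used in the paper.

```python
import math


def layer_norm(x):
    """Center a vector and rescale it to unit variance (no learned parameters).

    Assumes x is not constant; a constant vector would give zero variance.
    """
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    std = math.sqrt(sum(v * v for v in centered) / len(x))
    return [v / std for v in centered]


def gram_matrix(outputs):
    """Pairwise inner products of the rows of `outputs` (one row per input)."""
    return [[sum(a * b for a, b in zip(u, v)) for v in outputs] for u in outputs]


def max_abs_cosine(gram):
    """Largest |cosine similarity| between two distinct rows.

    A value of 0 means all rows are mutually orthogonal.
    """
    n = len(gram)
    return max(
        abs(gram[i][j]) / math.sqrt(gram[i][i] * gram[j][j])
        for i in range(n)
        for j in range(n)
        if i != j
    )


# Toy batch of two inputs with four features each (illustrative values only).
batch = [[1.0, 2.0, 3.0, 5.0], [2.0, 0.0, 1.0, -1.0]]
normalized = [layer_norm(x) for x in batch]
G = gram_matrix(normalized)
```

After this normalization every row has squared norm equal to the feature dimension, so the diagonal of `G` is constant and only the off-diagonal entries distinguish a near-isometric Gram matrix from a degenerate one.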