Sep, 2023
Theoretical Explanation of Activation Sparsity through Flat Minima and Adversarial Robustness
Ze Peng, Lei Qi, Yinghuan Shi, Yang Gao
TL;DR
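Building on gradient sparsity and random matrix theory, this study explains the theoretical mechanism behind activation sparsity in deep models and its importance for adversarial robustness and performance, and proposes several modules and modifications for training and for tuning sparsity.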
Abstract
A recent empirical observation of activation sparsity in MLP layers offers an opportunity to drastically reduce computation costs for free. Despite several works attributing it to training dynamics, the theoretical explanation of …