BriefGPT.xyz
Feb, 2024
No Free Prune: Information-Theoretic Barriers to Pruning at Initialization
Tanishq Kumar, Kevin Luo, Mark Sellke
TL;DR
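The existence of lottery tickets raises the question of whether large models are necessary in deep learning, and whether sparse networks can be quickly identified and trained without training the dense models that contain them. Through a theoretical account of lottery tickets, the paper shows that sparse networks need data-dependent masks to robustly interpolate noisy data. The study confirms that information acquired during training can affect model capacity.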
Abstract
The existence of "lottery tickets" (arXiv:1803.03635) at or near initialization raises the tantalizing question of whether large models are necessary in deep learning, or whether sparse networks can be quickly iden…