BriefGPT.xyz
Feb, 2021
Dense for the Price of Sparse: Improved Performance of Sparsely Initialized Networks via a Subspace Offset
Ilan Price, Jared Tanner
TL;DR
Introduces a new DCT-plus-sparse layer architecture that preserves information propagation and trainability even when only 0.01% of the kernel parameters remain trainable; networks built from these layers and trained from a sparse initialization achieve state-of-the-art accuracy at extreme sparsities.
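The core idea of a DCT-plus-sparse layer can be sketched as an effective weight matrix that is the sum of a fixed, dense DCT offset and a trainable sparse matrix, so only the sparse entries carry trainable parameters. Below is a minimal NumPy sketch of this idea; the function names (`dct_matrix`, `dct_plus_sparse_layer`) and the 1% density setting are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II matrix: the fixed, dense subspace offset.
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    D = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    D[0] /= np.sqrt(2)  # scale first row so rows are orthonormal
    return D

def dct_plus_sparse_layer(x, S, mask):
    # Effective weight = fixed DCT offset + masked (sparse) trainable matrix.
    # Only entries of S where mask == 1 would receive gradients in training.
    W = dct_matrix(S.shape[0]) + S * mask
    return W @ x

rng = np.random.default_rng(0)
n = 64
density = 0.01                                    # assumed: 1% trainable entries
mask = (rng.random((n, n)) < density).astype(float)
S = rng.standard_normal((n, n)) * 0.01            # small init for sparse part
x = rng.standard_normal(n)
y = dct_plus_sparse_layer(x, S, mask)
```

Because the DCT offset is orthogonal, the layer propagates signals well even when the sparse part is nearly empty, which is the intuition behind retaining trainability at extreme sparsity.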
Abstract
That neural networks may be pruned to high sparsities and retain high accuracy is well established. Recent research efforts focus on pruning immediately after initialization so as to allow the computational savings …