We present a novel network pruning algorithm called Dynamic Sparse Training that jointly finds the optimal network parameters and sparse network structure in a unified optimization process with trainable pruning thresholds. These thresholds are adjusted dynamically, with fine-grained layer-wise resolution, via backpropagation. We demonstrate that Dynamic Sparse Training can easily train very sparse neural network models with little performance loss, using the same number of training epochs as dense models. Dynamic Sparse Training achieves state-of-the-art performance compared with other sparse training algorithms on various network architectures. Additionally, we report several surprising observations that provide strong evidence for the effectiveness and efficiency of our algorithm. These observations reveal the underlying problems of traditional three-stage pruning algorithms and show how our algorithm could guide the design of more compact network architectures.

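In summary: this paper introduces a new neural network pruning algorithm, Dynamic Sparse Training, which uses trainable pruning thresholds to optimize the network's parameters and structure together, with the thresholds adjusted dynamically and at fine granularity via backpropagation. With this algorithm, sparse neural networks that still perform well can be trained easily. Compared with other sparse training algorithms, Dynamic Sparse Training achieves state-of-the-art results on several network architectures. The paper also identifies underlying problems with traditional three-stage pruning algorithms and offers guidance for designing more compact neural network architectures.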
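
The idea of a trainable pruning threshold can be illustrated with a small sketch. It is not the authors' implementation: the function names, the single scalar threshold per layer, and the triangular surrogate gradient are all assumptions made here for illustration. It shows a linear layer whose weights are masked by a magnitude threshold, and how a gradient with respect to that threshold could be obtained even though the step function has zero derivative almost everywhere.

```python
# Illustrative sketch only. All names below are hypothetical and are not
# taken from the paper's code; the surrogate gradient is an assumed choice.

def binary_mask(weights, threshold):
    """Keep a weight only if its magnitude exceeds the layer threshold."""
    return [[1.0 if abs(w) > threshold else 0.0 for w in row] for row in weights]

def masked_forward(weights, threshold, x):
    """Linear layer y = (W * mask) x, with mask = step(|W| - threshold)."""
    mask = binary_mask(weights, threshold)
    return [sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
            for w_row, m_row in zip(weights, mask)]

def sparsity(weights, threshold):
    """Fraction of weights removed by the mask."""
    mask = binary_mask(weights, threshold)
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return 1.0 - kept / total

def surrogate_step_grad(z, width=1.0):
    """Assumed triangular stand-in for d step(z)/dz (the true derivative is 0 a.e.)."""
    return max(0.0, 1.0 - abs(z) / width) / width

def threshold_grad(weights, threshold, x, grad_y):
    """dL/dthreshold under the surrogate, given grad_y = dL/dy."""
    g = 0.0
    for w_row, gy in zip(weights, grad_y):
        for w, xi in zip(w_row, x):
            # dL/dmask_ij = gy_i * w_ij * x_j
            # dmask_ij/dthreshold is approximated by -surrogate(|w_ij| - threshold)
            g += gy * w * xi * (-surrogate_step_grad(abs(w) - threshold))
    return g

# Toy usage: a 2x3 weight matrix and one input vector.
W = [[0.9, -0.05, 0.4], [0.02, -0.7, 0.1]]
x = [1.0, 2.0, 3.0]
y = masked_forward(W, 0.3, x)
g = threshold_grad(W, 0.3, x, grad_y=[1.0, 1.0])
```

Because the threshold receives a gradient like any other parameter, an optimizer can raise or lower it per layer during ordinary training, which is what lets the sparsity pattern change without a separate prune-and-retrain stage. A real implementation would typically express the same thing with an autograd framework and a custom backward pass for the step function.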

Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers