BriefGPT.xyz
Jul, 2020
Differentiable Joint Pruning and Quantization for Hardware Efficiency
Ying Wang, Yadong Lu, Tijmen Blankevoort
TL;DR
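We present a differentiable joint pruning and quantization (DJPQ) scheme that treats neural network compression as a joint gradient-based optimization problem, automatically trading off model pruning against quantization for hardware efficiency. Our method lets users find the best trade-off between the two in a single training run.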
Abstract
We present a differentiable joint pruning and quantization (DJPQ) scheme. We frame neural network compression as a joint gradient-based optimization …