BriefGPT.xyz
Jun, 2024
Efficient Neural Compression with Inference-time Decoding
C. Metz, O. Bichler, A. Dupret
TL;DR
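Combining mixed-precision quantization, zero-point quantization, and entropy coding pushes the compression boundary of ResNets beyond 1 bit, with an accuracy drop of no more than 1% on the ImageNet benchmark.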
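To see why entropy coding can take the average storage cost below 1 bit per weight, here is a minimal sketch. It is not the authors' method: the ternary thresholding, the Gaussian toy weights, the threshold value, and the function names `ternary_quantize` and `entropy_bits_per_symbol` are all assumptions made for illustration. It shows only that when most quantized weights are zero, the empirical Shannon entropy of the symbols drops under 1 bit.

```python
import math
import random
from collections import Counter


def ternary_quantize(weights, threshold):
    """Map each weight to -1, 0, or +1; weights within the threshold become 0.

    Hypothetical stand-in for a quantizer whose grid contains an exact zero.
    """
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


def entropy_bits_per_symbol(symbols):
    """Empirical Shannon entropy of the symbol stream, in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Toy data (assumption): 100,000 weights drawn from a standard normal.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(100_000)]

# Assumed threshold of 2.0: most weights collapse to the zero symbol.
quantized = ternary_quantize(weights, threshold=2.0)
bits = entropy_bits_per_symbol(quantized)
print(f"zero fraction: {quantized.count(0) / len(quantized):.3f}")
print(f"entropy: {bits:.3f} bits per weight")
```

A fixed-length code for three symbols would need 2 bits per weight. The entropy is what an ideal entropy coder approaches when it treats the symbols as independent, which is how a skewed distribution can fall below the 1-bit frontier.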
Abstract
This paper explores the combination of neural network quantization and entropy coding for memory footprint minimization. Edge deployment o