BriefGPT.xyz
Nov, 2023
PIPE: Parallelized Inference Through Post-Training Quantization Ensembling of Residual Expansions
Edouard Yvinec, Arnaud Dapogny, Kevin Bailly
TL;DR
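PIPE is a quantization method that converts floating-point operations to lower bit-width formats. Built on residual error expansion, group sparsity, and ensemble approximation, it achieves superior performance on every benchmarked application, architecture, and bit-width, from deep neural networks to natural language processing and from int8 to ternary quantization, while meeting data privacy requirements.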
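To make the idea of a residual error expansion concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes symmetric uniform quantization with a per-tensor scale, it leaves out the group sparsity and ensembling steps, and the names `quantize`, `residual_expansion`, and `max_error` are hypothetical. Each term quantizes the error left by the previous terms, so the sum of the terms approximates the original weights more closely as the order grows.

```python
def quantize(values, bits):
    """Symmetric uniform quantization (assumed scheme); returns dequantized floats."""
    levels = 2 ** (bits - 1) - 1  # 127 for 8 bits, 1 for 2 bits (ternary: -s, 0, +s)
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0.0 for _ in values]
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


def residual_expansion(weights, bits, order):
    """Return `order` quantized terms whose element-wise sum approximates `weights`."""
    terms = []
    residual = list(weights)
    for _ in range(order):
        q = quantize(residual, bits)
        terms.append(q)
        # The next term quantizes whatever error the current terms still leave.
        residual = [r - t for r, t in zip(residual, q)]
    return terms


def max_error(weights, terms):
    """Largest absolute gap between the weights and the sum of the terms."""
    approx = [sum(column) for column in zip(*terms)]
    return max(abs(w - a) for w, a in zip(weights, approx))


weights = [0.82, -0.33, 0.05, -0.91, 0.47]  # illustrative values only
for order in (1, 2, 3):
    terms = residual_expansion(weights, bits=2, order=order)
    print(order, max_error(weights, terms))
```

For a single linear layer, the product of the weights with an input is approximated by the sum of the products of each term with that input, so the terms can be evaluated independently and their outputs added.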
Abstract
Deep neural networks (DNNs) are ubiquitous in computer vision and natural language processing, but suffer from high inference cost. This problem can be addressed by quantization, which consists in converting floa