Jul, 2024
ERQ: Error Reduction for Post-Training Quantization of Vision Transformers
Yunshan Zhong, Jiawei Hu, You Huang, Yuxin Zhang, Rongrong Ji
TL;DR
Proposes ERQ, a two-step PTQ method that progressively reduces quantization error by optimizing activation quantization and then weight quantization, surpassing the state-of-the-art GPTQ by 22.36% in accuracy on the W3A4 ViT-S model.
Abstract
Post-training quantization (PTQ) for vision transformers (ViTs) has garnered significant attention due to its efficiency in compressing models. However, existing methods typically overlook the intricate interdependence between quantized weights and activations, …
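The TL;DR describes compensating activation-quantization error before the weights themselves are quantized. The sketch below illustrates that general idea in NumPy: quantize a layer's input activations, then solve a ridge-regression update of the still-full-precision weights so the layer's output is preserved. The uniform quantizer, the ridge formulation, and the regularization strength `lam` are illustrative assumptions, not ERQ's exact procedure.

```python
import numpy as np

def quantize_uniform(x, n_bits):
    """Symmetric uniform quantizer (a generic stand-in; the paper's
    exact quantizer settings are not reproduced here)."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(x).max() / qmax
    return np.round(x / scale).clip(-qmax, qmax) * scale

def compensate_weights(X, W, n_bits=4, lam=1e-2):
    """Reduce activation-quantization error by updating the (still
    full-precision) weights via ridge regression, so that the quantized
    activations X_q reproduce the original outputs X @ W as closely as
    possible. lam is an assumed regularization strength."""
    Xq = quantize_uniform(X, n_bits)
    # Closed-form ridge solution: W' = (Xq^T Xq + lam*I)^-1 Xq^T (X W)
    A = Xq.T @ Xq + lam * np.eye(X.shape[1])
    W_new = np.linalg.solve(A, Xq.T @ (X @ W))
    return Xq, W_new

# Toy usage: the layer-output error should shrink after compensation.
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 64))   # calibration activations
W = rng.standard_normal((64, 64))    # full-precision layer weights
Xq, W_new = compensate_weights(X, W)
err_before = np.linalg.norm(X @ W - Xq @ W)
err_after = np.linalg.norm(X @ W - Xq @ W_new)
print(f"output error: {err_before:.3f} -> {err_after:.3f}")
```

On random calibration data the printed output error should drop noticeably after the weight update; reducing that layer-wise error before and during weight quantization is the effect the "error reduction" steps in the TL;DR aim for.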