Vision transformers (ViTs) have garnered significant attention for their performance in vision tasks; however, the high computational cost and significant latency issues have hinder widespread adoption. Post-training quantization (PTQ), a promising method for model compression, still faces accuracy degradation challenges with ViTs. There are two reasons for this: the existing quantization paradigm does not fit the power-law distribution of post-Softmax activations well, and accuracy inevitably decreases after reparameterizing post-LayerNorm activations. We propose a Distribution-Friendly and Outlier-Aware Post-training Quantization method for Vision Transformers, named DopQ-ViT. DopQ-ViT analyzes the inefficiencies of current quantizers and introduces a distribution-friendly Tan Quantizer called TanQ. TanQ focuses more on values near 1, more accurately preserving the power-law distribution of post-Softmax activations, and achieves favorable results. Moreover, when reparameterizing post-LayerNorm activations from channel-wise to layer-wise quantization, the accuracy degradation is mainly due to the significant impact of outliers in the scaling factors. Therefore, DopQ-ViT proposes a method to Search for the Optimal Scaling Factor, denoted as SOSF, which compensates for the influence of outliers and preserves the performance of the quantization model. DopQ-ViT has undergone extensive validation and demonstrates significant performance improvements in quantization models, particularly in low-bit settings.

本研究针对视觉变换器（ViTs）在后训练量化（PTQ）中面临的准确性下降问题，提出了一种新的量化方法DopQ-ViT。该方法引入了分布友好的Tan量化器（TanQ）和优化的缩放因子搜索（SOSF），有效解决了后Softmax激活的能力法则分布适应性不足和 LayerNorm 后激活的异常值影响，从而显著提升了低比特设置下的量化模型性能。

面向分布友好和异常值感知的后训练量化方法：DopQ-ViT