The complex architecture and high training cost of vision transformers motivate the exploration of post-training quantization. However, the heavy-tailed distribution of vision transformer activations limits the effectiveness of previous post-training quantization methods, even with advanced quantizer designs. Instead of tuning the quantizer to better fit the complicated activation distribution, this paper proposes NoisyQuant, a quantizer-agnostic enhancement for the post-training activation quantization performance of vision transformers. We make a surprising theoretical discovery: for a given quantizer, adding a fixed noisy bias sampled from a uniform distribution to the values being quantized can significantly reduce the quantization error under provable conditions. Building on this theoretical insight, NoisyQuant is the first method to actively alter the heavy-tailed activation distribution with an additive noisy bias so that it fits a given quantizer. Extensive experiments show that NoisyQuant substantially improves the post-training quantization performance of vision transformers with minimal computation overhead. For instance, on linear uniform 6-bit activation quantization, NoisyQuant improves state-of-the-art top-1 accuracy on ImageNet by up to 1.7%, 1.1%, and 0.5% for ViT, DeiT, and Swin Transformer respectively, achieving on-par or even higher performance than previous nonlinear, mixed-precision quantization.

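NoisyQuant is a quantizer-agnostic method for improving the post-training activation quantization performance of vision transformers. Its theoretical basis is that, for a given quantizer, adding a fixed uniform noisy bias can significantly reduce the quantization error under provable conditions. Building on this theory, NoisyQuant uses an additive noisy bias to alter the heavy-tailed activation distribution so that it fits the given quantizer. Extensive experiments show that NoisyQuant substantially improves the performance of vision transformers under post-training quantization at a small computational cost.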
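
The mechanism described above, adding a fixed noisy bias to activations before a given quantizer, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`uniform_quantize`, `sample_noisy_bias`, `noisy_linear`), the choice of a noise range of half a quantization step, and the step of removing the bias afterwards by folding it into the linear layer's bias are all assumptions made for illustration. The demo deliberately places activations near bin edges, one case in which plain rounding performs poorly.

```python
import numpy as np


def uniform_quantize(x, step, num_bits=6):
    """Symmetric linear uniform quantizer: round to the nearest grid point, then clip."""
    qmax = 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(x / step), -qmax - 1, qmax)
    return q * step


def sample_noisy_bias(dim, step, rng):
    """Draw one fixed noisy bias per channel from Uniform(-step/2, step/2).

    The half-step range is an illustrative assumption, not a value from the paper.
    """
    return rng.uniform(-step / 2, step / 2, size=dim)


def noisy_linear(x, weight, bias, noisy_bias, step, num_bits=6):
    """Linear layer on quantized activations with the noisy bias removed afterwards.

    Computes W q(x + N) + (b - W N), which equals W (q(x + N) - N) + b.
    Folding W N into the bias is an assumed way to undo the added noise.
    """
    x_q = uniform_quantize(x + noisy_bias, step, num_bits)
    folded_bias = bias - weight @ noisy_bias  # depends only on fixed quantities
    return x_q @ weight.T + folded_bias


rng = np.random.default_rng(0)
step, dim, tokens = 0.1, 64, 512

# Synthetic activations placed close to bin edges (hypothetical data).
x = (rng.integers(-8, 8, size=(tokens, dim)) + 0.45) * step

# The noisy bias is sampled once and shared by every token.
noisy_bias = sample_noisy_bias(dim, step, rng)

# Mean squared quantization error without and with the noisy bias.
err_plain = np.mean((uniform_quantize(x, step) - x) ** 2)
err_noisy = np.mean((uniform_quantize(x + noisy_bias, step) - noisy_bias - x) ** 2)

print(f"plain quantization MSE: {err_plain:.6f}")
print(f"noisy-bias quantization MSE: {err_noisy:.6f}")
```

In this constructed case the noisy bias lowers the error because every value sits far from its bin center. For values that already sit at bin centers the same bias would raise the error, which is consistent with the claim holding only under certain conditions. Because the bias is fixed, the extra term in `noisy_linear` can be computed once ahead of inference.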

NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers