May 2023
LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
Zechun Liu, Barlas Oguz, Changsheng Zhao, Ernie Chang, Pierre Stock...
TL;DR
The paper proposes a data-free distillation method that leverages generations produced by the pretrained model itself to enable low-bit quantization of language models, covering weights, activations, and the KV cache; at low bit precisions, this approach outperforms existing post-training and training-free quantization methods for large language models.
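The recipe in the TL;DR can be sketched briefly. Below is a minimal, hypothetical Python illustration of data-free distillation, assuming a generic Hugging Face causal LM as the full-precision teacher: the teacher samples its own training text from scratch, and a quantized student is trained against the teacher's logits. The checkpoint name, sampling settings, and the kd_step helper are assumptions for illustration, not the paper's exact procedure.

```python
# Hypothetical sketch of data-free distillation for QAT.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_name = "facebook/opt-1.3b"  # placeholder checkpoint, not the paper's
tok = AutoTokenizer.from_pretrained(teacher_name)
teacher = AutoModelForCausalLM.from_pretrained(teacher_name).eval()

@torch.no_grad()
def synthesize_batch(num_seqs=4, max_len=128):
    # Data-free: prompts are just the BOS token; the teacher's own
    # sampled generations become the training corpus.
    bos = torch.full((num_seqs, 1), tok.bos_token_id, dtype=torch.long)
    return teacher.generate(bos, do_sample=True, top_k=50, max_length=max_len)

def kd_step(student, optimizer, input_ids, temperature=1.0):
    # Distill: KL divergence between teacher and quantized-student logits.
    with torch.no_grad():
        t_logits = teacher(input_ids).logits
    s_logits = student(input_ids).logits
    loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```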
Abstract
Several post-training quantization methods have been applied to large language models (LLMs), and have been shown to perform well down to 8-bits. We find that these methods break down at lower bit precision, and investigate quantization aware training for LLMs (LLM-QAT) to push quantization levels even further.
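On the quantization side, a standard QAT building block is fake quantization with a straight-through estimator (STE): the forward pass simulates low-bit values while gradients still reach the full-precision weights. The sketch below shows a symmetric MinMax quantizer of the kind that can be applied per-channel to weights or per-token to activations and the KV cache; the bit-width and axis choices here are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of symmetric MinMax fake quantization with a straight-through estimator.
import torch

def fake_quantize(x, num_bits=4, dim=None):
    # Symmetric quantizer: the scale comes from the max absolute value,
    # either per-tensor (dim=None) or per-channel/per-token (dim=int).
    qmax = 2 ** (num_bits - 1) - 1
    if dim is None:
        scale = x.abs().max().clamp(min=1e-8) / qmax
    else:
        scale = x.abs().amax(dim=dim, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    # STE: forward uses the quantized value; backward passes gradients
    # through unchanged, so the full-precision weights stay trainable.
    return x + (q - x).detach()

w = torch.randn(8, 16, requires_grad=True)
w_q = fake_quantize(w, num_bits=4, dim=1)  # per-output-channel weight quantization
w_q.sum().backward()                       # gradients flow to w via the STE
```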