BriefGPT.xyz
Jun, 2023
OWQ: Lessons learned from activation outliers for weight quantization in large language models
Changhun Lee, Jungyu Jin, Taesu Kim, Hyungjun Kim, Eunhyeok Park
TL;DR
This paper proposes a post-training quantization method that applies higher precision only to selected weights, preserving model quality while greatly reducing the number of GPUs required for inference and improving cost-efficiency.
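The core idea of outlier-aware weight quantization can be illustrated with a minimal sketch: weight columns that multiply activation channels with large magnitudes (and are therefore most sensitive to quantization error) are kept in full precision, while the remaining columns are rounded to a low-bit grid. This is a simplified illustration, not the paper's exact algorithm; the function name, the use of a per-channel activation scale as the sensitivity metric, and the symmetric per-column quantizer are all assumptions for demonstration.

```python
import numpy as np

def mixed_precision_quantize(W, act_scale, n_keep=2, n_bits=3):
    """Illustrative sketch (not the paper's exact method): keep the
    weight columns tied to the largest activation magnitudes in full
    precision, and round the rest to a symmetric low-bit grid.

    W         : (out_features, in_features) weight matrix
    act_scale : per-input-channel activation magnitude estimate
    n_keep    : number of columns to keep in full precision
    n_bits    : bit-width for the quantized columns
    """
    # Columns fed by the largest activations are the most sensitive,
    # so they are excluded from quantization.
    keep = np.argsort(act_scale)[-n_keep:]
    mask = np.ones(W.shape[1], dtype=bool)
    mask[keep] = False

    Wq = W.copy()
    cols = W[:, mask]
    # Symmetric per-column uniform quantization of the remaining columns.
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(cols).max(axis=0) / qmax
    scale[scale == 0] = 1.0  # guard against all-zero columns
    Wq[:, mask] = np.round(cols / scale) * scale
    return Wq, keep
```

Because only `n_keep` columns stay in full precision, the storage overhead over a uniform low-bit model is marginal, which is what makes the mixed-precision scheme economical.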
Abstract
Large language models (LLMs) with hundreds of billions of parameters show impressive results across various language tasks using simple prompt tuning and few-shot examples, without the need for task-specific fine-tuning.