BriefGPT.xyz
Apr, 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan, Lin Niu, Jiawei Liu, Wenyu Liu, Xinggang Wang...
TL;DR
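This paper proposes RPTQ, a new reorder-based quantization method that addresses the differences in activation ranges in large-scale language models, allowing activations to be quantized down to 3 bits and reducing storage and computation overhead.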
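To make the idea concrete, here is a minimal sketch of range-aware quantization in plain Python. The summary above does not spell out how RPTQ reorders or groups activations, so everything in the code is an assumption made for illustration: channels are simply sorted by range width (a stand-in, not necessarily the paper's procedure), split into equal-sized groups, and each group gets its own 3-bit uniform quantizer. The names `quantize_group` and `reorder_and_quantize` are hypothetical.

```python
def quantize_group(values, lo, hi, bits=3):
    """Uniformly quantize values to `bits` bits over [lo, hi], then dequantize."""
    levels = (1 << bits) - 1                      # 7 steps for 3 bits
    scale = (hi - lo) / levels if hi > lo else 1.0
    out = []
    for v in values:
        q = round((v - lo) / scale)               # integer code
        q = max(0, min(levels, q))                # clamp to the valid code range
        out.append(lo + q * scale)                # map the code back to a real value
    return out


def reorder_and_quantize(activations, num_groups=2, bits=3):
    """activations: list of token rows, each a list of per-channel values.

    Assumption: channels are ordered by range width and split into
    equal-sized groups; this is an illustrative stand-in only.
    """
    num_channels = len(activations[0])
    ch_min = [min(row[c] for row in activations) for c in range(num_channels)]
    ch_max = [max(row[c] for row in activations) for c in range(num_channels)]

    # Reorder: channel indices sorted from narrowest to widest range.
    order = sorted(range(num_channels), key=lambda c: ch_max[c] - ch_min[c])

    group_size = -(-num_channels // num_groups)   # ceiling division
    result = [[0.0] * num_channels for _ in activations]
    for start in range(0, num_channels, group_size):
        group = order[start:start + group_size]
        lo = min(ch_min[c] for c in group)        # shared range for this group
        hi = max(ch_max[c] for c in group)
        for t, row in enumerate(activations):
            deq = quantize_group([row[c] for c in group], lo, hi, bits)
            for c, v in zip(group, deq):
                result[t][c] = v                  # write back in original channel order
    return order, result


if __name__ == "__main__":
    acts = [[0.1, 50.0, -0.2, -40.0],
            [0.3, -60.0, 0.0, 70.0],
            [-0.1, 10.0, 0.2, 5.0]]
    order, deq = reorder_and_quantize(acts, num_groups=2, bits=3)
    print("channel order:", order)
    print("dequantized:", deq)
```

With a single shared range (`num_groups=1`), the narrow-range channels are dominated by the wide-range ones and lose almost all resolution; grouping channels with similar ranges keeps their quantization error small even at 3 bits.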
Abstract
Large-scale language models (LLMs) have demonstrated outstanding performance on various tasks, but their deployment poses challenges due to their enormous model size. In this paper, we identify that the main chal