BriefGPT.xyz
May, 2024
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
Wei Huang, Haotong Qin, Yangdong Liu, Yawei Li, Xianglong Liu...
TL;DR
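This paper proposes SliM-LLM, a salience-driven mixed-precision quantization scheme that improves the accuracy and memory footprint of large language models, and further reduces perplexity by integrating a gradient-based quantizer.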
Abstract
Large language models (LLMs) achieve remarkable performance in natural language understanding but require substantial computation and memory resources. Post-training quantization (PTQ) is a powerful compression t…