May 2024
LCQ: Low-Rank Codebook based Quantization for Large Language Models
Wen-Pu Cai, Wu-Jun Li
TL;DR
Large language models have demonstrated promising performance on many tasks, but their high storage and computational costs make them challenging to deploy. This paper proposes a new weight quantization method for large language models, called low-rank codebook based quantization (LCQ). By adopting a low-rank codebook whose rank can be larger than one, LCQ achieves better accuracy than existing methods with essentially no increase in storage cost.
Abstract
Large language models (LLMs) have recently demonstrated promising performance in many tasks. However, the high storage and computational cost of LLMs has become a challenge for deploying LLMs. Weight quantization is a common way to reduce these costs. This paper proposes a novel weight quantization method for LLMs, called low-rank codebook based quantization (LCQ), which adopts a low-rank codebook whose rank can be larger than one and achieves better accuracy than existing methods with a negligible increase in storage cost.
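
To make the low-rank codebook idea concrete, below is a minimal NumPy sketch of codebook-lookup quantization in which the codebook is parameterized as a rank-r product U @ V.T, so each weight group gets its own row of reconstruction levels while storing only small factor matrices. This is an illustration of the general idea as described in the TL;DR only; the factor shapes, the group layout, and the random (rather than learned) codebook are assumptions, not the paper's actual parameterization or training procedure.

```python
import numpy as np

def build_lowrank_codebook(num_groups, num_levels, rank, rng):
    # Hypothetical parameterization: codebook C = U @ V.T has rank at most `rank`,
    # so each weight group g gets its own row of reconstruction levels C[g].
    U = rng.standard_normal((num_groups, rank))   # per-group factors
    V = rng.standard_normal((num_levels, rank))   # level factors shared across groups
    return U @ V.T                                # shape (num_groups, num_levels)

def quantize(weights, codebook):
    # weights: (num_groups, group_size), codebook: (num_groups, num_levels).
    # Each weight is replaced by the index of the nearest codeword in its group's row.
    dists = np.abs(weights[:, :, None] - codebook[:, None, :])
    return dists.argmin(axis=-1)                  # (num_groups, group_size) integer indices

def dequantize(indices, codebook):
    # Look up each stored index in the corresponding group row of the codebook.
    return np.take_along_axis(codebook, indices, axis=1)

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 64))                  # toy weight matrix: 8 groups of 64 weights
C = build_lowrank_codebook(num_groups=8, num_levels=16, rank=4, rng=rng)  # 16 levels (4-bit), rank 4
idx = quantize(W, C)
W_hat = dequantize(idx, C)
print("relative reconstruction error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```

In this sketch, raising the rank adds only the small U and V factors on top of the per-weight indices, which is consistent with the claim that the extra storage cost is negligible; the actual LCQ method would learn the codebook rather than sampling it randomly as done here.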