May 2024
LCQ: Low-Rank Codebook based Quantization for Large Language Models
Wen-Pu Cai, Wu-Jun Li
TL;DR
Large language models have demonstrated promising performance on many tasks, but their high storage and computational costs make them challenging to deploy. This paper proposes a new weight quantization method for large language models, called low-rank codebook based quantization (LCQ). By adopting a low-rank codebook whose rank can be larger than one, LCQ achieves better accuracy than existing methods with essentially no increase in storage cost.
Abstract
Large language models (LLMs) have recently demonstrated promising performance in many tasks. However, the high storage and computational cost of LLMs has become a challenge for deploying LLMs. Weight quantization is a common way to reduce these costs. This paper proposes a novel weight quantization method for LLMs, called low-rank codebook based quantization (LCQ), which adopts a low-rank codebook whose rank can be larger than one and achieves better accuracy than existing methods with a negligible increase in storage cost.
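
To make the low-rank codebook idea concrete, below is a minimal NumPy sketch of codebook-lookup quantization in which the codebook is parameterized as a rank-r product U @ V.T, so each weight group gets its own row of reconstruction levels while storing only small factor matrices. This is an illustration of the general idea as described in the TL;DR only; the factor shapes, the group layout, and the random (rather than learned) codebook are assumptions, not the paper's actual parameterization or training procedure.

```python
import numpy as np

def build_lowrank_codebook(num_groups, num_levels, rank, rng):
    # Hypothetical parameterization: codebook C = U @ V.T has rank at most `rank`,
    # so each weight group g gets its own row of reconstruction levels C[g].
    U = rng.standard_normal((num_groups, rank))   # per-group factors
    V = rng.standard_normal((num_levels, rank))   # level factors shared across groups
    return U @ V.T                                # shape (num_groups, num_levels)

def quantize(weights, codebook):
    # weights: (num_groups, group_size), codebook: (num_groups, num_levels).
    # Each weight is replaced by the index of the nearest codeword in its group's row.
    dists = np.abs(weights[:, :, None] - codebook[:, None, :])
    return dists.argmin(axis=-1)                  # (num_groups, group_size) integer indices

def dequantize(indices, codebook):
    # Look up each stored index in the corresponding group row of the codebook.
    return np.take_along_axis(codebook, indices, axis=1)

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 64))                  # toy weight matrix: 8 groups of 64 weights
C = build_lowrank_codebook(num_groups=8, num_levels=16, rank=4, rng=rng)  # 16 levels (4-bit), rank 4
idx = quantize(W, C)
W_hat = dequantize(idx, C)
print("relative reconstruction error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```

In this sketch, raising the rank adds only the small U and V factors on top of the per-weight indices, which is consistent with the claim that the extra storage cost is negligible; the actual LCQ method would learn the codebook rather than sampling it randomly as done here.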