Oct, 2024
ARB-LLM: Alternating Refined Binarizations for Large Language Models
Zhiteng Li, Xianglong Yan, Tianao Zhang, Haotong Qin, Dong Xie...
TL;DR
This work addresses the high memory and computational demands that hinder the practical deployment of large language models (LLMs) by proposing ARB-LLM, a novel post-training quantization technique. Through an alternating refined binarization algorithm, it effectively narrows the distribution gap between binarized and full-precision weights, and introduces a strategy for handling column-wise deviation, yielding significant performance gains over existing binarization methods.
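As a rough illustration of the idea described above, the following is a minimal NumPy sketch of alternating refinement for weight binarization. It is not the authors' ARB-LLM implementation; the function name `arb_binarize`, the per-row scale, the per-column bias, and the iteration count are assumptions made for illustration only.

```python
import numpy as np

def arb_binarize(W, num_iters=5):
    """Hypothetical sketch: approximate W with alpha * B + mu, where
    B is a {-1, +1} matrix, alpha is a per-row scale, and mu is a
    per-column bias (a stand-in for the column-deviation handling
    described in the TL;DR). Not the authors' released algorithm."""
    mu = W.mean(axis=0, keepdims=True)              # per-column bias
    R = W - mu                                      # residual to binarize
    B = np.sign(R)
    B[B == 0] = 1
    for _ in range(num_iters):
        # Refine the per-row scale with B fixed (least-squares fit).
        alpha = (R * B).sum(axis=1, keepdims=True) / B.shape[1]
        # Refine the binary codes with the scale fixed.
        B = np.sign(R)
        B[B == 0] = 1
        # Refine the column bias against the current binary approximation.
        mu = (W - alpha * B).mean(axis=0, keepdims=True)
        R = W - mu
    return alpha, B, mu

# Usage: binarize a random weight matrix and check reconstruction error.
W = np.random.randn(8, 16).astype(np.float32)
alpha, B, mu = arb_binarize(W)
W_hat = alpha * B + mu
print("relative error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```

Each pass re-fits the scale, binary codes, and column bias in turn, so the reconstruction error of the binarized weights is non-increasing across iterations; this alternating refinement is the general pattern the TL;DR refers to.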
Abstract
Large Language Models (LLMs) have greatly pushed forward advancements in natural language processing, yet their high memory and computational demands hinder practical deployment. Binarization, as an effective compression technique, …