BriefGPT.xyz
Jun, 2024
Rotation and Permutation for Advanced Outlier Management and Efficient Quantization of LLMs
Haokun Lin, Haobo Xu, Yichen Wu, Jingzhi Cui, Yingtao Zhang...
TL;DR
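This study proposes DuQuant, a quantization strategy that uses rotation and permutation transformations to eliminate outlier activations more effectively. It shows strong outlier management across multiple tasks and achieves top-tier results even under 4-bit weight-activation quantization.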
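To see why a rotation can help with outlier activations at all, here is a minimal sketch in plain Python. It is not DuQuant's actual construction: as a stand-in it uses a generic normalized Hadamard matrix, and the activation vector `x`, weight vector `w`, and helper names are all made up for illustration. The point is only that an orthogonal matrix `R` leaves the layer output unchanged, since (xR)(Rᵀw) = xw, while spreading a single large activation across channels. A permutation matrix is also orthogonal, so the same identity holds for channel reordering.

```python
import math

def hadamard(n):
    """Build the n x n normalized Hadamard matrix (n must be a power of two)."""
    h = [[1.0]]
    while len(h) < n:
        # Sylvester construction: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]

def transpose(m):
    return [list(col) for col in zip(*m)]

def row_times_matrix(x, m):
    """Row vector x times matrix m."""
    return [sum(x[i] * m[i][j] for i in range(len(x))) for j in range(len(m[0]))]

def matrix_times_col(m, w):
    """Matrix m times column vector w."""
    return [sum(m[i][j] * w[j] for j in range(len(w))) for i in range(len(m))]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

# Hypothetical activation with one outlier channel, and a hypothetical weight column.
x = [0.5, -0.3, 40.0, 0.2]
w = [0.1, 0.2, -0.05, 0.3]

R = hadamard(4)                          # orthogonal: R @ R^T = I
x_rot = row_times_matrix(x, R)           # rotated activation, x R
w_rot = matrix_times_col(transpose(R), w)  # counter-rotated weight, R^T w

y = dot(x, w)              # original output
y_rot = dot(x_rot, w_rot)  # output after rotation, equal up to rounding

peak_before = max(abs(v) for v in x)
peak_after = max(abs(v) for v in x_rot)
print(y, y_rot, peak_before, peak_after)
```

With a smaller peak magnitude, a uniform low-bit quantizer can use a finer step size for the same range, which is the general motivation for this kind of transform before quantizing.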
Abstract
Quantizing large language models (LLMs) presents significant challenges, primarily due to outlier activations that compromise the efficien...