Oct, 2023
TEQ: Trainable Equivalent Transformation for Quantization of LLMs
Wenhua Cheng, Yiyang Cai, Kaokao Lv, Haihao Shen
TL;DR
This paper introduces a trainable equivalent transformation that preserves the FP32 precision of the model output while exploiting low-precision quantization, in particular 3- and 4-bit weight-only quantization, to meet the computational demands of modern architectures. The method is lightweight to train and adds no computational overhead at inference; its results are on par with state-of-the-art methods, and it can be combined with other approaches for further gains.
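To make the core idea concrete, here is a minimal NumPy sketch of an output-preserving ("equivalent") transformation followed by naive weight quantization. All variable names and the round-to-nearest quantizer are illustrative assumptions, not the paper's actual implementation or training objective.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # activations
W = rng.standard_normal((16, 8))   # linear-layer weight, y = x @ W.T

# Hypothetical trainable per-input-channel scale (in TEQ such scales
# would be learned; here it is random just to show the equivalence).
s = rng.uniform(0.5, 2.0, size=8)

# Scale the weight columns by s and the activations by 1/s:
# the FP32 output is mathematically unchanged.
W_t = W * s
x_t = x / s
assert np.allclose(x @ W.T, x_t @ W_t.T)

# Quantization then acts on the transformed weight W_t instead of W.
# A simple 4-bit round-to-nearest per-channel quantizer as a stand-in:
def quantize_rtn(w, bits=4):
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    return np.round(w / scale).clip(-qmax - 1, qmax) * scale

y_q = x_t @ quantize_rtn(W_t).T    # low-bit path after the transform
```

Because the 1/s factor can be folded into the preceding operation's weights, the transform costs nothing extra at inference, which matches the TL;DR's claim of zero inference-time overhead.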
Abstract
As large language models (LLMs) become more prevalent, there is a growing need for new and improved quantization methods that can meet the computational demands of these modern architectures while maintaining accuracy.