March 2024
Magic for the Age of Quantized DNNs
Yoshihide Sawada, Ryuji Saiin, Kazuma Suetake
TL;DR
This paper proposes a quantization-aware training method. It introduces Layer-Batch Normalization, a novel normalization that is independent of the mini-batch size, quantizes weights by applying a scaled round-clip function to standardized weights, quantizes activations with the same function, and trains the model using surrogate gradients. Experiments demonstrate that the proposed quantization achieves minimal degradation in accuracy.
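As a rough illustration of the recipe the TL;DR describes, the sketch below implements a generic scaled round-clip quantizer applied to standardized weights, with a straight-through estimator standing in for the surrogate gradient. It is only a sketch under assumptions: the PyTorch framing, the function names (`scaled_round_clip`, `RoundClipSTE`, `quantize_weights`), the 4-bit default, the per-row standardization, and the 3-sigma clipping range are illustrative choices, not details taken from the paper.

```python
import torch


def scaled_round_clip(x: torch.Tensor, num_bits: int = 4, scale: float = 1.0) -> torch.Tensor:
    """Generic scaled round-clip quantizer: map x onto 2**num_bits - 1 signed
    levels spaced uniformly inside [-scale, scale]."""
    qmax = 2 ** (num_bits - 1) - 1
    x_int = torch.round(torch.clamp(x / scale * qmax, -qmax, qmax))
    return x_int / qmax * scale


class RoundClipSTE(torch.autograd.Function):
    """Quantize in the forward pass; in the backward pass use a straight-through
    surrogate gradient, passed only where the input was not clipped."""

    @staticmethod
    def forward(ctx, x, num_bits, scale):
        ctx.save_for_backward(x)
        ctx.scale = scale
        return scaled_round_clip(x, num_bits, scale)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        pass_through = (x.abs() <= ctx.scale).to(grad_output.dtype)
        return grad_output * pass_through, None, None


def quantize_weights(w: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    """Standardize each row of a 2-D weight matrix to zero mean / unit variance,
    then quantize with the scaled round-clip function (the 3-sigma clip range is
    an arbitrary illustrative choice)."""
    w_std = (w - w.mean(dim=1, keepdim=True)) / (w.std(dim=1, keepdim=True) + 1e-5)
    return RoundClipSTE.apply(w_std, num_bits, 3.0)


if __name__ == "__main__":
    w = torch.randn(8, 16, requires_grad=True)
    w_q = quantize_weights(w, num_bits=4)   # forward: discrete weight values
    w_q.sum().backward()                    # backward: straight-through surrogate gradient
    print(w_q.detach().unique().numel(), "distinct levels; grad shape", w.grad.shape)
```

In a full quantization-aware training loop, the same round-clip function would also be applied to the activations (after the mini-batch-size-independent normalization), again with a surrogate gradient, as the TL;DR describes.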
Abstract
Recently, the number of parameters in DNNs has explosively increased, as exemplified by LLMs (Large Language Models), making inference on small-scale computers more difficult. Model compression technology is, the…