Jan 2025
Scaling Laws for Floating Point Quantization Training
Xingwu Sun, Shuaipeng Li, Ruobing Xie, Weidong Han, Kan Wu...
TL;DR
Addressing the relatively superficial state of existing research on floating-point quantization training, this work presents a comprehensive study of how floating-point quantization targets, exponent bits, mantissa bits, and related factors affect the training performance of large language models (LLMs). The study finds that the optimal floating-point quantization precision is directly proportional to computational power, and it derives the optimal exponent-mantissa bit ratio for different total bit widths, offering a reference for hardware manufacturers.
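To make the exponent/mantissa terminology concrete, below is a minimal NumPy sketch of rounding a tensor to a custom floating-point format with configurable exponent and mantissa bit widths. It is an illustration only, not the authors' implementation: the function name fp_quantize and the IEEE-style bias and clamping behavior are assumptions, and it omits the scaling-factor granularity that the paper also studies.

import numpy as np

def fp_quantize(x, exponent_bits, mantissa_bits):
    # Round each value to the nearest number representable in a custom
    # float format with the given exponent/mantissa widths (IEEE-style
    # bias assumed; top exponent code reserved for Inf/NaN; the subnormal
    # grid falls out of clamping the exponent to the normal range).
    x = np.asarray(x, dtype=np.float64)
    bias = 2 ** (exponent_bits - 1) - 1
    max_exp = (2 ** exponent_bits - 2) - bias      # largest normal exponent
    min_exp = 1 - bias                             # smallest normal exponent
    sign = np.sign(x)
    mag = np.abs(x)
    exp = np.clip(np.floor(np.log2(np.where(mag > 0, mag, 1.0))), min_exp, max_exp)
    step = 2.0 ** (exp - mantissa_bits)            # spacing between adjacent codes
    max_val = (2 - 2.0 ** -mantissa_bits) * 2.0 ** max_exp
    return sign * np.minimum(np.round(mag / step) * step, max_val)

# Example: the same tensor rounded to an E4M3-style vs. an E5M2-style format.
x = np.array([0.0123, 0.37, 1.5, 240.0])
print(fp_quantize(x, exponent_bits=4, mantissa_bits=3))
print(fp_quantize(x, exponent_bits=5, mantissa_bits=2))

More exponent bits widen the dynamic range, while more mantissa bits shrink the step size within each binade; that trade-off is what the paper's optimal exponent-mantissa ratio addresses.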
Abstract
Low-precision training is considered an effective strategy for reducing both training and downstream inference costs. Previous scaling laws for precision mainly focus on integer quantization …
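For contrast with the floating-point format above, integer quantization, the focus of the prior precision scaling laws the abstract refers to, places all values on a single uniform grid set by one scale factor. A minimal sketch assuming symmetric per-tensor quantization; the function name and defaults are illustrative, not from any specific prior work:

import numpy as np

def int_quantize(x, bits=8):
    # Symmetric per-tensor integer quantization: one uniform step size
    # for the whole tensor, with no per-value exponent.
    x = np.asarray(x, dtype=np.float64)
    qmax = 2 ** (bits - 1) - 1                     # e.g. 127 for int8
    scale = max(np.max(np.abs(x)) / qmax, 1e-12)   # guard all-zero input
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

print(int_quantize(np.array([0.0123, 0.37, 1.5, 240.0])))

Because one scale must cover the whole tensor, small values lose relative precision whenever large outliers are present, which is one reason integer-only scaling laws fit floating-point quantization training poorly.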