BriefGPT.xyz
Nov, 2024
BitNet a4.8: 4-bit Activations for 1-bit LLMs
Hongyu Wang, Shuming Ma, Furu Wei
TL;DR
This work addresses the high inference cost and performance degradation of 1-bit large language models (LLMs). It introduces BitNet a4.8, which applies a hybrid quantization and sparsification strategy: 4-bit activations in the attention and feed-forward network layers, with sparsification of the intermediate states. Extensive experiments show that BitNet a4.8 achieves faster inference while matching the performance of BitNet b1.58, improving the efficiency of large LLMs.
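The paper's exact quantizers are not reproduced on this page; as a rough illustration of what "4-bit activations" means, the sketch below shows a generic per-tensor absmax quantizer that maps floating-point activations to the signed 4-bit range [-8, 7]. The function names and the per-tensor granularity are assumptions for illustration, not the paper's method.

```python
import numpy as np

def quant_int4_absmax(x: np.ndarray):
    """Illustrative per-tensor absmax quantization to the signed 4-bit range [-8, 7].

    Not BitNet a4.8's actual quantizer; a generic sketch of 4-bit
    activation quantization.
    """
    # Map the largest magnitude in x to 7; guard against an all-zero tensor.
    scale = max(np.max(np.abs(x)) / 7.0, 1e-12)
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequant_int4(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate floating-point activations from 4-bit codes."""
    return q.astype(np.float32) * scale

# Example: quantize a small activation tensor and check the round-trip error.
rng = np.random.default_rng(0)
x = rng.standard_normal(16).astype(np.float32)
q, s = quant_int4_absmax(x)
x_hat = dequant_int4(q, s)
```

With absmax rounding, every in-range value is reconstructed to within half a quantization step (s / 2), which is the usual accuracy/efficiency trade-off that motivates pairing low-bit activations with sparsification of outlier-heavy intermediate states.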
Abstract
Recent research on 1-bit Large Language Models (LLMs), such as BitNet b1.58, presents a promising direction for reducing the inference cost of LLMs while maintaining their performance. In this work, we introduce BitNet a4.8, enabling …