BriefGPT.xyz
Oct, 2023
Sparse Finetuning for Inference Acceleration of Large Language Models
Eldar Kurtic, Denis Kuznedelev, Elias Frantar, Michael Goin, Dan Alistarh
TL;DR
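We study the problem of accurate sparse finetuning of large language models: finetuning pretrained language models on specialized tasks while inducing sparsity in their weights. We propose SquareHead, an L2-based distillation approach that enables accurate recovery at high sparsity levels, and we show that sparse language models deliver speedups in both CPU and GPU execution.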
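To make the idea of an L2-based distillation loss concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: this summary does not give the exact SquareHead formulation, so the per-layer normalization by the teacher's squared magnitude, the averaging over layers, the `eps` term, and the function name `l2_feature_distillation_loss` are all assumptions made for illustration.

```python
from typing import Sequence

Vector = Sequence[float]


def l2_feature_distillation_loss(
    student_layers: Sequence[Vector],
    teacher_layers: Sequence[Vector],
    eps: float = 1e-6,
) -> float:
    """Illustrative L2 feature distillation loss (hypothetical helper).

    For each layer, take the mean squared error between the sparse
    student's features and the dense teacher's features, divide it by the
    teacher's mean squared magnitude (an assumed normalization), and
    average the result over layers.
    """
    if len(student_layers) != len(teacher_layers):
        raise ValueError("student and teacher must expose the same number of layers")
    if not teacher_layers:
        raise ValueError("at least one layer is required")

    total = 0.0
    for s, t in zip(student_layers, teacher_layers):
        if len(s) != len(t) or len(t) == 0:
            raise ValueError("feature vectors must be non-empty and of matching size")
        # Mean squared difference between student and teacher features.
        sq_err = sum((a - b) ** 2 for a, b in zip(s, t)) / len(t)
        # Teacher's mean squared magnitude, used to put layers on one scale.
        t_norm = sum(b ** 2 for b in t) / len(t)
        total += sq_err / (t_norm + eps)
    return total / len(teacher_layers)
```

A student whose features match the teacher's exactly gets a loss of zero, and the normalization keeps layers with large activations from dominating the sum. In a real finetuning run the same quantity would be computed on framework tensors so that gradients flow into the sparse student.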
Abstract
We consider the problem of accurate sparse finetuning of large language models (LLMs), that is, finetuning pretrained LLMs on specialized tasks, while inducing
→