Dec 2023
Sparse is Enough in Fine-tuning Pre-trained Large Language Model
Weixi Song, Zuchao Li, Lefei Zhang, Hai Zhao, Bo Du
TL;DR
By studying how the loss function on downstream tasks transforms from random initialization to pre-trained initialization, this paper reveals the sparsity of parameter gradients, proposes a gradient-based sparse fine-tuning algorithm, Sparse Increment Fine-Tuning (SIFT), and validates its effectiveness on multiple tasks.
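The TL;DR describes SIFT as updating only a sparse subset of parameters, selected by gradient information. Below is a minimal PyTorch sketch of that idea, assuming a top-k gradient-magnitude criterion and a plain SGD update; the function name and the `density` parameter are illustrative, not the authors' API.

```python
import torch

def sparse_increment_step(param: torch.nn.Parameter, lr: float = 1e-4, density: float = 0.01):
    """Update only the entries of `param` whose gradients have the largest
    magnitude; all other entries stay frozen. A sketch of the sparse
    fine-tuning idea, not the authors' implementation."""
    grad = param.grad
    if grad is None:
        return
    k = max(1, int(density * grad.numel()))        # how many entries to update
    top_idx = torch.topk(grad.abs().flatten(), k).indices
    mask = torch.zeros(grad.numel(), dtype=torch.bool, device=grad.device)
    mask[top_idx] = True                           # keep only the top-k gradient entries
    with torch.no_grad():
        param -= lr * grad * mask.view_as(grad)    # sparse increment via masked SGD

# Toy usage: one masked update on a random weight matrix.
w = torch.nn.Parameter(torch.randn(768, 768))
loss = (w @ torch.randn(768)).pow(2).sum()
loss.backward()
sparse_increment_step(w, lr=1e-4, density=0.01)
```

In a practical variant, the selected index set could be fixed once from early gradients so that only a small, stable fraction of the weights is ever touched, which is what would make the increment both sparse and memory-efficient.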
Abstract
With the prevalence of the pre-training-fine-tuning paradigm, how to efficiently adapt the pre-trained model to downstream tasks has been an intriguing issue. Parameter-efficient fine-tuning (PEFT) methods have b…