Fine-tuning large language models (LLMs) on downstream tasks requires substantial computational resources. A class of parameter-efficient fine-tuning (PEFT) aims to mitigate these computational challenges by selectively fine-tuning only a small fraction of the model parameters. Although computationally efficient, these techniques often fail to match the performance of fully fine-tuned models, primarily due to inherent biases introduced during parameter selection. Traditional selective PEFT techniques use a fixed set of parameters based on a predefined budget (a process also known as unmasking), failing to capture parameter importance dynamically and often ending up exceeding the budget. We introduce $\text{ID}^3$, a novel selective PEFT method that calculates parameter importance continually and dynamically unmasks parameters by balancing exploration and exploitation in parameter selection. Our empirical study on 15 tasks spanning natural language understanding and generative tasks demonstrates the effectiveness of our method compared to fixed-masking-based PEFT techniques. We analytically show that $\text{ID}^3$ reduces the number of gradient updates by a factor of two, enhancing computational efficiency. $\text{ID}^3$ is robust to random initialization of neurons and, therefore, can be seamlessly integrated into existing additive and reparametrization-based PEFT modules such as adapters and LoRA for dynamic sparsification.

本文研究了解决大语言模型微调时计算资源消耗大的问题。提出了一种名为ID³的新方法，通过动态评估参数重要性，以平衡探索与利用，对参数进行选择性微调，显著提升了微调效率与性能。研究表明，ID³能够将梯度更新次数减少一半，从而提高计算效率，并能与现有框架无缝集成。

逐步解封高效参数微调大语言模型