Although Large Language Models (LLMs) have demonstrated powerful few-shot
learning capabilities through prompting, supervised training is still
necessary for complex reasoning tasks. Because of their
extensive parameters and memory consumption, both Parameter-Efficient
Fine-Tuning (PEFT) methods and Memory-Efficient Fine-Tuning methods have been
proposed for LLMs. Nevertheless, the large amount of annotated data that
fine-tuning consumes, which is the problem Data-Efficient Fine-Tuning aims to
solve, remains unexplored. One obvious approach is to combine a PEFT method
with active learning. However, experimental results show that such a
combination is not trivial and yields inferior results. Probe experiments
suggest two main reasons for this observation: an uncertainty gap and poor
model calibration. Therefore, in this
paper, we propose a novel approach to effectively integrate uncertainty-based
active learning and LoRA. Specifically, for the uncertainty gap, we introduce a
dynamic uncertainty measurement that combines the uncertainty of the base model
and the uncertainty of the full model during the iteration of active learning.
For poor model calibration, we incorporate a regularization method during
LoRA training to keep the model from being over-confident, and we employ
Monte-Carlo dropout to enhance the uncertainty estimation.
Experimental results show that the proposed approach outperforms existing
baseline models on three complex reasoning tasks.
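
The two ingredients named above, Monte-Carlo dropout and a dynamic blend of base-model and full-model uncertainty, can be sketched as follows. This is a minimal illustration and not the paper's implementation: the abstract does not give the combination rule, so the linear schedule in `dynamic_uncertainty` is an assumption, and `make_toy_classifier`, `select_for_annotation`, and the toy weights are hypothetical names introduced only for this example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mc_dropout_entropy(stochastic_forward, x, n_passes=20):
    """Entropy of the class distribution averaged over several dropout passes."""
    runs = [stochastic_forward(x) for _ in range(n_passes)]
    n_classes = len(runs[0])
    mean = [sum(r[c] for r in runs) / n_passes for c in range(n_classes)]
    return entropy(mean)


def dynamic_uncertainty(u_base, u_full, step, total_steps):
    """Blend base-model and full-model uncertainty.

    ASSUMPTION: a linear schedule that trusts the base model at the first
    active-learning iteration and the full (base + LoRA) model at the last.
    The abstract only states that the two uncertainties are combined.
    """
    lam = min(max(step / total_steps, 0.0), 1.0)
    return (1.0 - lam) * u_base + lam * u_full


def select_for_annotation(scores, k):
    """Indices of the k highest-uncertainty pool examples."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def make_toy_classifier(weights, drop_p, rng):
    """Hypothetical linear classifier with dropout on its input features."""
    def forward(x):
        # Inverted dropout: zero a feature with probability drop_p, rescale the rest.
        kept = [0.0 if rng.random() < drop_p else v / (1.0 - drop_p) for v in x]
        logits = [sum(w * v for w, v in zip(row, kept)) for row in weights]
        return softmax(logits)
    return forward


rng = random.Random(0)
base_model = make_toy_classifier([[0.2, 0.0], [0.0, 0.2]], drop_p=0.0, rng=rng)
full_model = make_toy_classifier([[1.0, -1.0], [-1.0, 1.0]], drop_p=0.1, rng=rng)
pool = [[2.0, 0.0], [0.5, 0.5], [0.0, 3.0]]

scores = [
    dynamic_uncertainty(
        mc_dropout_entropy(base_model, x, n_passes=1),  # base model: no dropout
        mc_dropout_entropy(full_model, x),              # full model: MC dropout
        step=1,
        total_steps=4,
    )
    for x in pool
]
picked = select_for_annotation(scores, k=1)
```

In a real active-learning loop the selected examples would be annotated, added to the training set, and the LoRA weights retrained before the next scoring round; those steps are omitted here.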

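By combining uncertainty-based active learning with LoRA, this paper proposes a new method that measures uncertainty dynamically to address the uncertainty gap and introduces regularization during LoRA training. The method outperforms existing baseline models on three complex reasoning tasks.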

STAR: Constraint LoRA with Dynamic Active Learning for Data-Efficient Fine-Tuning of Large Language Models

Large Language Models (LLMs) have showcased remarkable impacts across a wide
spectrum of natural language processing tasks. Fine-tuning these pre-trained
models on downstream datasets provides further significant performance gains,
but this process has been challenging due to its extraordinary resource
requirements. To this end, existing efforts focus on parameter-efficient
fine-tuning, which, unfortunately, fails to capitalize on the powerful potential
of full-parameter fine-tuning. In this work, we propose QFT, a novel Quantized
Full-parameter Tuning framework for LLMs that enables memory-efficient
fine-tuning without harming performance. Our framework incorporates two novel
ideas: (i) we adopt the efficient Lion optimizer, which only keeps track of the
momentum and has consistent update magnitudes for each parameter, an inherent
advantage for robust quantization; and (ii) we quantize all model states and
store them as integer values, and present a gradient flow and parameter update
scheme for the quantized weights. As a result, QFT reduces the model state
memory to 21% of the standard solution while achieving comparable performance,
e.g., tuning a LLaMA-7B model requires less than 30 GB of memory, which a
single A6000 GPU can provide.
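
To make the two ideas above concrete, here is a small sketch of a Lion update step followed by integer storage of the optimizer state. The Lion rule (sign of an interpolated momentum, so every parameter moves by the same magnitude) follows the published optimizer. The quantizer is an assumption: the abstract does not specify QFT's scheme, so a generic symmetric per-tensor int8 quantizer stands in for it, and the function names are hypothetical.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def lion_step(params, grads, momentum, lr=1e-4, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """One Lion update; momentum is the only optimizer state kept."""
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # The update direction is a sign, so its magnitude is the same
        # for every parameter.
        update = sign(beta1 * m + (1.0 - beta1) * g)
        new_params.append(p - lr * (update + weight_decay * p))
        new_momentum.append(beta2 * m + (1.0 - beta2) * g)
    return new_params, new_momentum


def quantize_int8(values):
    """ASSUMPTION: symmetric per-tensor quantization to integers in [-127, 127]."""
    peak = max(abs(v) for v in values)
    scale = peak / 127.0 if peak > 0 else 1.0
    return [round(v / scale) for v in values], scale


def dequantize(q_values, scale):
    """Map stored integers back to approximate floating-point values."""
    return [q * scale for q in q_values]


params = [1.0, -2.0]
grads = [0.5, -0.25]
momentum = [0.0, 0.0]

params, momentum = lion_step(params, grads, momentum, lr=0.1)

# Store the momentum as integers between steps, then restore it when needed.
q_momentum, scale = quantize_int8(momentum)
restored = dequantize(q_momentum, scale)
```

Because each Lion update has magnitude `lr` (plus weight decay), the quantization error in the momentum can only change a parameter's step when it flips the sign of the interpolated momentum, which is one way to read the abstract's remark about robustness to quantization.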

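This paper proposes QFT, a novel quantized full-parameter tuning framework that enables memory-efficient fine-tuning without harming performance. The framework adopts the efficient Lion optimizer, stores all model states as quantized integer values, and provides a gradient flow and parameter update scheme for the quantized weights. Results show that QFT reduces model state memory to 21% of the standard solution while achieving comparable performance; for example, tuning a LLaMA-7B model requires less than 30 GB of memory, which a single A6000 GPU can provide.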

QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources

Parameter-efficient fine-tuning (PEFT) of pre-trained language models (PLMs)
has emerged as a highly successful approach: it trains only a small number of
parameters without sacrificing performance, and it has become the de-facto
learning paradigm as PLMs grow in size. However, existing PEFT
methods are not memory-efficient, because they still require caching most of
the intermediate activations for the gradient calculation, akin to fine-tuning.
One effective way to reduce the activation memory is to apply a reversible
model, so that the intermediate activations need not be cached and can be
recomputed instead. Nevertheless, modifying a PLM to its reversible variant with
PEFT is not straightforward, since the reversible model has a distinct
architecture from the currently released PLMs. In this paper, we first
investigate what the key factor is for the success of existing PEFT methods,
and find that it is essential to preserve the PLM's starting point when
initializing a PEFT method. With this finding, we propose memory-efficient
fine-tuning (MEFT) that inserts adapters into a PLM, preserving the PLM's
starting point and making it reversible without additional pre-training. We
evaluate MEFT on the GLUE benchmark and five question-answering tasks with
various backbones: BERT, RoBERTa, BART and OPT. MEFT significantly reduces the
activation memory, by up to 84% relative to full fine-tuning, with a negligible
number of trainable parameters. Moreover, MEFT achieves the same score on GLUE
and a comparable score on the question-answering tasks as full fine-tuning.
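
The reason a reversible model avoids caching activations can be shown with a generic additive coupling layer. The sketch below is the standard reversible-residual construction, not MEFT's specific design, which the abstract does not detail; the sub-functions `f` and `g` and all names are hypothetical stand-ins for a PLM layer and an adapter.

```python
import math


def reversible_forward(x1, x2, f, g):
    """Additive coupling: y1 = x1 + f(x2), y2 = x2 + g(y1)."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def reversible_inverse(y1, y2, f, g):
    """Recover the inputs from the outputs alone, so they need not be cached."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


def f(v):
    """Hypothetical stand-in for a pre-trained sub-layer."""
    return [math.tanh(t) for t in v]


def g(v):
    """Hypothetical stand-in for an adapter branch."""
    return [0.5 * t for t in v]


x1, x2 = [0.3, -1.2], [2.0, 0.7]
y1, y2 = reversible_forward(x1, x2, f, g)

# During the backward pass the inputs are recomputed from the outputs
# instead of being read from a cache.
r1, r2 = reversible_inverse(y1, y2, f, g)
```

The inverse only needs `f` and `g` to be evaluated again, not inverted, which is why arbitrary sub-layers can be used; the cost is the extra computation during the backward pass.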

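This paper proposes memory-efficient fine-tuning (MEFT), which inserts adapters into a pre-trained language model to preserve the PLM's starting point and make it reversible, reducing activation memory by up to 84% relative to full fine-tuning while achieving the same score as full fine-tuning on the GLUE benchmark.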