We present a novel Parameter-Efficient Fine-Tuning (PEFT) method, dubbed Adaptive Freezing of Low Rank Adaptation (AFLoRA). Specifically, for each pre-trained frozen weight tensor, we add a parallel path of trainable low-rank matrices, namely a down-projection and an up-projection matrix, each of which is followed by a feature transformation vector. Based on a novel freezing score, we then incrementally freeze these projection matrices during fine-tuning to reduce computation and alleviate over-fitting. Our experimental results demonstrate that we can achieve state-of-the-art performance, with an average improvement of up to $0.85\%$ on the GLUE benchmark, while yielding up to $9.5\times$ fewer average trainable parameters. In terms of runtime, AFLoRA yields up to a $1.86\times$ improvement over similar PEFT alternatives. Besides the practical utility of our approach, we provide insights into the trainability requirements of LoRA paths at different modules and the freezing schedule for the different projection matrices. Code will be released.
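
To make the structure described above concrete, the following is a minimal sketch, not the released implementation. It assumes the path computes $W_0 x + s_u \odot (B\,(s_d \odot (A x)))$, where $A$ is the down-projection, $B$ the up-projection, and $s_d$, $s_u$ the feature transformation vectors. The class name `AdapterPath`, the initialization, and the rule "freeze a projection matrix once its score reaches a threshold" are illustrative assumptions; the actual freezing score is not specified in this abstract, so scores are passed in as plain numbers.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


class AdapterPath:
    """Frozen weight W0 plus a parallel low-rank path (hypothetical sketch)."""

    def __init__(self, W0, rank, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W0), len(W0[0])
        self.W0 = W0  # pre-trained weight, never trained
        # Down-projection A (rank x d_in) and up-projection B (d_out x rank).
        self.A = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[rng.gauss(0, 0.02) for _ in range(rank)] for _ in range(d_out)]
        # Feature transformation vectors following A and B.
        # Assumption: s_up starts at zero so the path is initially a no-op.
        self.s_down = [1.0] * rank
        self.s_up = [0.0] * d_out
        self.trainable = {"A": True, "B": True, "s_down": True, "s_up": True}

    def forward(self, x):
        h = [s * a for s, a in zip(self.s_down, matvec(self.A, x))]
        u = [s * b for s, b in zip(self.s_up, matvec(self.B, h))]
        return [w + d for w, d in zip(matvec(self.W0, x), u)]

    def freeze_by_score(self, scores, threshold):
        """Freeze projection matrices whose score reaches the threshold.

        Assumption: only A and B are candidates for freezing, and a higher
        score means "ready to freeze". The vectors stay trainable.
        """
        for name in ("A", "B"):
            if scores.get(name, 0.0) >= threshold:
                self.trainable[name] = False

    def num_trainable(self):
        sizes = {
            "A": len(self.A) * len(self.A[0]),
            "B": len(self.B) * len(self.B[0]),
            "s_down": len(self.s_down),
            "s_up": len(self.s_up),
        }
        return sum(n for name, n in sizes.items() if self.trainable[name])


# Usage: a 3x3 identity weight with a rank-2 path.
layer = AdapterPath([[1, 0, 0], [0, 1, 0], [0, 0, 1]], rank=2)
before = layer.num_trainable()
layer.freeze_by_score({"A": 0.9, "B": 0.1}, threshold=0.5)  # made-up scores
after = layer.num_trainable()
```

In this sketch, freezing a projection matrix removes its entries from the trainable count while the much smaller feature transformation vectors keep adapting, which is one way to read the trainable-parameter reduction claimed above.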

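This work proposes a new parameter-efficient fine-tuning method (AFLoRA). For each frozen pre-trained weight tensor, it adds a parallel path of trainable low-rank matrices (a down-projection and an up-projection matrix) and incrementally freezes these projection matrices during fine-tuning according to a freezing score, reducing computation and alleviating over-fitting. Experiments on the GLUE benchmark show state-of-the-art performance, with an average improvement of up to $0.85\%$, up to $9.5\times$ fewer average trainable parameters, and up to a $1.86\times$ runtime improvement over similar parameter-efficient fine-tuning methods. The work also provides insights into the trainability requirements of LoRA paths at different modules and the freezing schedule for the projection matrices.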

AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models