Recent studies have applied parameter-efficient fine-tuning techniques (PEFTs) to efficiently narrow the performance gap between pre-training and downstream tasks. Two factors matter for any PEFT: the size of the accessible data and the size of the fine-tunable parameters. A natural expectation is that the performance of a PEFT is positively related to both. However, after evaluating five PEFTs on two downstream vision-language (VL) tasks, we find that this intuition holds only if the downstream data and task are not consistent with pre-training. For downstream fine-tuning that is consistent with pre-training, data size no longer affects performance, while the influence of fine-tunable parameter size is not monotonic. We believe this observation could guide the choice of training strategy for various PEFTs.

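Recent studies have applied parameter-efficient fine-tuning techniques (PEFTs) to efficiently narrow the performance gap between pre-training and downstream tasks. This study finds that, for downstream fine-tuning consistent with pre-training, data size no longer affects performance, while the influence of fine-tunable parameter size is not monotonic; this observation can guide the choice of training strategy for PEFTs.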

An Empirical Study of Parameter-Efficient Fine-Tuning for Vision-Language Pre-trained Models