This work introduces ATTEMPT (Attentional Mixture of Prompt Tuning), a new modular, multi-task, and parameter-efficient language model (LM) tuning approach that combines knowledge transferred across different tasks via a mixture of soft prompts while keeping the original LM unchanged. ATTEMPT interpolates between a set of prompts trained on large-scale source tasks and a newly initialized target task prompt, using instance-wise attention computed by a lightweight sub-network trained on multiple target tasks. ATTEMPT is parameter-efficient (e.g., it updates 1,600 times fewer parameters than fine-tuning) and enables multi-task learning and flexible extensions; importantly, it is also more interpretable because it shows which source tasks affect the final model decision on target tasks. Experimental results across 17 diverse datasets show that ATTEMPT improves on prompt tuning by up to 22% in absolute performance and outperforms or matches fully fine-tuned models and other parameter-efficient tuning approaches that use over ten times more parameters.
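
To make the interpolation step concrete, the following is a minimal sketch of an instance-wise attentional mixture of prompts, written in plain Python for readability. It is an illustration under stated assumptions, not the authors' implementation: the function names (`max_pool`, `project`, `mix_prompts`), the use of max-pooling to summarize the input and each prompt, the single linear map standing in for the lightweight sub-network, and the softmax temperature are all hypothetical choices made for this example.

```python
import math
import random


def max_pool(matrix):
    """Column-wise max over the rows of a (length x dim) matrix."""
    return [max(col) for col in zip(*matrix)]


def softmax(scores, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def project(vector, weight):
    """Apply a (dim_out x dim_in) linear map to a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weight]


def mix_prompts(input_embeds, source_prompts, target_prompt, weight,
                temperature=1.0):
    """Return (mixed_prompt, attention) for one input instance.

    Assumption: the attention is taken over the source prompts plus the
    target prompt, and `weight` stands in for the attention sub-network.
    """
    candidates = source_prompts + [target_prompt]
    # Query: pooled input embeddings passed through the (hypothetical) sub-network.
    query = project(max_pool(input_embeds), weight)
    # One score per candidate prompt: dot product with its pooled representation.
    scores = [sum(q * k for q, k in zip(query, max_pool(p))) for p in candidates]
    attention = softmax(scores, temperature)
    rows, dim = len(target_prompt), len(target_prompt[0])
    # Attention-weighted sum of the candidate prompts, position by position.
    mixed = [
        [sum(a * p[i][j] for a, p in zip(attention, candidates)) for j in range(dim)]
        for i in range(rows)
    ]
    return mixed, attention


# Toy usage with random values; sizes are arbitrary.
random.seed(0)
dim, prompt_len, seq_len = 4, 3, 5
rand_matrix = lambda r, c: [[random.gauss(0, 1) for _ in range(c)] for _ in range(r)]
sources = [rand_matrix(prompt_len, dim) for _ in range(2)]  # source task prompts
target = rand_matrix(prompt_len, dim)                       # new target prompt
identity = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
mixed, attention = mix_prompts(rand_matrix(seq_len, dim), sources, target, identity)
print([round(a, 3) for a in attention])
```

In a real setup the mixed prompt would be prepended to the input embeddings of the unchanged LM, and the attention weights could be inspected per instance to see which source tasks contribute to a prediction.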

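This paper proposes ATTEMPT, a new multi-task, parameter-efficient language model tuning method that learns to transfer knowledge across tasks through short prefix embedding vectors pre-trained on different tasks. Using encodings of the source prompts, the method trains an attention module that, for each instance, interpolates between the source prompts and a newly initialized target prompt for the target task. During training, only the target task prompt and the attention weights are updated, while the original language model and the source prompts are kept unchanged. Experimental results show that ATTEMPT significantly outperforms prompt tuning, and outperforms or matches full fine-tuning and other parameter-efficient tuning methods that use over ten times more parameters. Finally, ATTEMPT outperforms previous work in few-shot learning settings.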

ATTEMPT: Parameter-Efficient Multi-Task Tuning via Attentional Mixtures of Soft Prompts