Attribution scores indicate the importance of different input parts and can thus explain model behaviour. Prompt-based models are currently gaining popularity, in part because they adapt more easily in low-resource settings. However, the quality of attribution scores extracted from prompt-based models has not yet been investigated. In this work, we address this gap by analyzing attribution scores extracted from prompt-based models with respect to plausibility and faithfulness, and by comparing them with attribution scores extracted from fine-tuned models and large language models. In contrast to previous work, we introduce training size as an additional dimension of the analysis. We find that the prompting paradigm (with either encoder-based or decoder-based models) yields more plausible explanations than fine-tuning in low-resource settings, and that Shapley Value Sampling consistently outperforms attention and Integrated Gradients, leading to more plausible and faithful explanations.
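To make the notion of Shapley Value Sampling concrete, the following is a minimal sketch of the general permutation-sampling estimator. It is not the setup used in this work. The names `shapley_value_sampling`, `toy_model`, and `baseline` are illustrative assumptions: the toy model is a hypothetical linear scorer over three features, whereas in the setting described above the features would be input tokens and the model output would be a class score.

```python
import random


def shapley_value_sampling(model, x, baseline, n_samples=200, seed=0):
    """Estimate per-feature Shapley values by averaging marginal
    contributions over randomly sampled feature permutations."""
    rng = random.Random(seed)
    n = len(x)
    attributions = [0.0] * n
    for _ in range(n_samples):
        order = list(range(n))
        rng.shuffle(order)
        # Start from the baseline and switch features on one at a time.
        current = list(baseline)
        prev_out = model(current)
        for i in order:
            current[i] = x[i]
            out = model(current)
            # Marginal contribution of feature i in this permutation.
            attributions[i] += out - prev_out
            prev_out = out
    return [a / n_samples for a in attributions]


def toy_model(features):
    # Hypothetical linear scorer, used only for illustration.
    weights = [2.0, -1.0, 0.5]
    return sum(w * f for w, f in zip(weights, features))


scores = shapley_value_sampling(
    toy_model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0]
)
print(scores)
```

For a linear model the estimate recovers each feature's weight times its offset from the baseline, and the scores sum to the difference between the model output on the input and on the baseline. For a non-linear model the estimate only approximates the exact Shapley values, and its variance shrinks as `n_samples` grows.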

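By analyzing the plausibility and faithfulness of attribution scores extracted from prompt-based models, and comparing them with attribution scores extracted from fine-tuned models and large language models, we find that the prompting paradigm (with either encoder-based or decoder-based models) yields more plausible explanations than fine-tuning in low-resource settings, and that Shapley Value Sampling consistently outperforms attention and Integrated Gradients in producing more plausible and faithful explanations.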

Explanation and Attribution Analysis of Pre-trained Language Models in Low-Resource Settings