There has been a growing interest in interpreting the underlying dynamics of
Transformers. While self-attention patterns were initially considered the
primary source of explanations, recent studies have shown that integrating other components can
yield more accurate explanations. This paper introduces a novel token
attribution analysis method that incorporates all the components in the encoder
block and aggregates the resulting attributions across layers. Through extensive quantitative and
qualitative experiments, we demonstrate that our method can produce faithful
and meaningful global token attributions. Our experiments reveal that
incorporating almost every encoder component results in increasingly
accurate analysis in both local (single layer) and global (the whole model)
settings. Our global attribution analysis significantly outperforms previous
methods on various tasks in terms of correlation with gradient-based saliency
scores. Our code is freely available at
this https URL.

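This paper proposes a new token attribution analysis method that combines all components of the encoder block and aggregates the results across layers. Extensive quantitative and qualitative experiments show that the method produces faithful and meaningful global token attributions. Incorporating almost every encoder component improves the analysis in both the local (single layer) and global (whole model) settings, and the global attribution analysis significantly outperforms previous methods on various tasks in terms of correlation with gradient-based saliency scores. The code is freely available at this https URL.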

GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers
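
The abstract above says that per-layer attributions are aggregated across layers but does not say how. A minimal sketch of one common choice, rollout (multiplying row-normalized layer matrices from the first layer upward), is given below. The function names and the use of rollout are assumptions made for illustration, not a description of the authors' code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def row_normalize(matrix):
    """Scale each row to sum to 1 (rows that sum to 0 are left unchanged)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total for v in row] if total else list(row))
    return out


def rollout(layer_attributions):
    """Aggregate per-layer attributions into one global matrix.

    layer_attributions: list of (tokens x tokens) matrices, first layer first.
    Entry [i][j] of a layer matrix is how much output token i draws on input
    token j in that layer. This is an assumed aggregation scheme.
    """
    n = len(layer_attributions[0])
    # Start from the identity: before any layer, each token depends on itself.
    global_attr = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for layer in layer_attributions:
        global_attr = matmul(row_normalize(layer), global_attr)
    return global_attr


# Toy example with two tokens and two layers (values are made up).
layer1 = [[0.5, 0.5], [0.0, 1.0]]
layer2 = [[0.0, 1.0], [1.0, 0.0]]
global_attribution = rollout([layer1, layer2])
```

Each row of `global_attribution` then describes how much a final-layer token depends on each input token under this assumed scheme.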

In this paper, we propose a simple and effective technique to allow for
efficient self-supervised learning with bi-directional Transformers. Our
approach is motivated by recent studies demonstrating that self-attention
patterns in trained models contain a majority of non-linguistic regularities.
We propose a computationally efficient auxiliary loss function to guide
attention heads to conform to such patterns. Our method is agnostic to the
actual pre-training objective and results in faster convergence of models as
well as better performance on downstream tasks compared to the baselines,
achieving state-of-the-art results in low-resource settings. Surprisingly, we
also find that linguistic properties of attention heads are not necessarily
correlated with language modeling performance.

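This paper proposes a simple and effective technique for efficient self-supervised learning with bidirectional Transformers. The method uses an auxiliary loss function to guide attention heads toward self-attention patterns and can be applied to different pre-training objectives. Experiments show that it converges faster than the baseline models and performs better on downstream tasks, achieving state-of-the-art results in low-resource settings.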
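
The abstract above describes an auxiliary loss that guides attention heads toward fixed patterns, without giving its form. The sketch below shows one way such a loss could look: a mean squared error between a head's attention matrix and a target pattern, added to the pre-training loss. The "previous token" pattern, the squared-error form, and the `weight` parameter are all assumptions made for illustration.

```python
def previous_token_pattern(n):
    """Hypothetical target: token i attends to token i-1 (token 0 to itself)."""
    return [[1.0 if j == max(i - 1, 0) else 0.0 for j in range(n)]
            for i in range(n)]


def attention_guidance_loss(attention, target):
    """Mean squared error between one head's attention and a target pattern."""
    n = len(attention)
    squared_error = sum((attention[i][j] - target[i][j]) ** 2
                        for i in range(n) for j in range(n))
    return squared_error / (n * n)


def total_loss(pretraining_loss, attention, target, weight=1.0):
    """Add the weighted auxiliary term to any pre-training loss value.

    The auxiliary term does not depend on how pretraining_loss was computed,
    which mirrors the objective-agnostic claim in the abstract.
    """
    return pretraining_loss + weight * attention_guidance_loss(attention, target)


# Toy example: a head that attends uniformly over three tokens.
target = previous_token_pattern(3)
uniform = [[1.0 / 3.0] * 3 for _ in range(3)]
loss = total_loss(2.0, uniform, target, weight=0.5)
```

In a real training loop the attention matrices would come from the model and the loss would be differentiated by the framework; plain Python lists are used here only to keep the sketch self-contained.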