Analyzing the similarity of internal representations within and across models is an important technique for understanding the behavior of deep neural networks. Most existing methods for comparing high-dimensional representations, such as those based on Canonical Correlation Analysis (CCA) and the widely used Centered Kernel Alignment (CKA), rely on statistical properties of the representations over a set of data points. In this paper, we focus on transformer models and study the similarity of representations between the hidden layers of an individual transformer. In this context, we show that a simple sample-wise cosine similarity metric captures this similarity and agrees with the more complex CKA. Our experimental results on common transformers reveal that representations across layers are positively correlated, although the similarity decreases as layers grow farther apart. We then propose an aligned training approach that enhances the similarity between internal representations. Models trained this way have the following properties: (1) the last-layer classifier can be applied directly after any hidden layer, yielding intermediate-layer accuracies much higher than those under standard training; (2) the layer-wise accuracies increase monotonically and reveal the minimal depth needed for the given task; (3) when used as multi-exit models, they perform on par with standard multi-exit architectures, which rely on additional classifiers designed for early exiting at shallow layers. To our knowledge, our work is the first to show that one common classifier is sufficient for multi-exit models. We conduct experiments on both vision and NLP tasks to demonstrate the performance of the proposed aligned training.
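
As an illustration of the sample-wise cosine similarity idea, the following is a minimal sketch rather than the paper's implementation. It assumes every layer yields one vector per sample with the same dimension (for example, a class token or a pooled token representation), and it averages the per-sample similarities over the data set; both the averaging and the names `cosine_similarity` and `layerwise_similarity` are assumptions made for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is zero
    return dot / (norm_u * norm_v)


def layerwise_similarity(hidden_states):
    """Average sample-wise cosine similarity between every pair of layers.

    hidden_states[l][i] is the representation of sample i at layer l
    (assumed layout, not taken from the paper).
    Returns a matrix sim where sim[a][b] compares layer a with layer b.
    """
    num_layers = len(hidden_states)
    num_samples = len(hidden_states[0])
    sim = [[0.0] * num_layers for _ in range(num_layers)]
    for a in range(num_layers):
        for b in range(num_layers):
            total = sum(
                cosine_similarity(hidden_states[a][i], hidden_states[b][i])
                for i in range(num_samples)
            )
            sim[a][b] = total / num_samples
    return sim
```

Unlike CCA or CKA, each term in the average depends on a single sample only, so the score can be computed one sample at a time without collecting statistics over a batch.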

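We study the similarity of representations between hidden layers in transformer models and show that a simple sample-wise cosine similarity metric captures this similarity and agrees with the more complex statistical method CKA. By proposing an aligned training approach, we enhance the similarity between internal representations and obtain models that can produce outputs at multiple hidden layers. Compared with standard training, these models reach higher accuracy at intermediate layers, and when used as multi-exit models they perform on par with standard multi-exit architectures. Our work is the first to show that a single common classifier is sufficient for multi-exit models.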
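
To make the single-classifier multi-exit idea concrete, here is a hedged sketch of early-exit inference in which the same classifier is applied after every layer. The exit rule (stop once the largest softmax probability reaches a threshold) is an assumption made for illustration, since the text above does not specify one; `early_exit_predict`, `layers`, `classifier`, and `threshold` are hypothetical names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(x, layers, classifier, threshold=0.9):
    """Run the layers in order and apply one shared classifier after each.

    layers: list of callables mapping a representation to the next one.
    classifier: callable mapping a representation to a list of logits.
    Returns (predicted_class, exit_layer_index).
    The confidence threshold is an assumed exit rule, not the paper's.
    """
    if not layers:
        raise ValueError("at least one layer is required")
    h = x
    last = len(layers) - 1
    for idx, layer in enumerate(layers):
        h = layer(h)
        probs = softmax(classifier(h))
        confidence = max(probs)
        # Exit early when confident enough; always exit at the final layer.
        if confidence >= threshold or idx == last:
            return probs.index(confidence), idx
```

A toy call such as `early_exit_predict([1.0, 0.0], [lambda h: [2 * v for v in h]] * 3, lambda h: h)` exercises the loop. In a standard multi-exit architecture, `classifier` would instead be a separate head per exit.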

On Layer-wise Representation Similarity: Application to Multi-Exit Models with a Single Classifier