Recent work has sought to understand the behavior of neural networks by comparing representations between layers and between different trained models. We examine methods for comparing neural network representations based on canonical correlation analysis (CCA). We show that CCA belongs to a family of statistics for measuring multivariate similarity, but that neither CCA nor any other statistic that is invariant to invertible linear transformation can measure meaningful similarities between representations of higher dimension than the number of data points. We introduce a similarity index that measures the relationship between representational similarity matrices and does not suffer from this limitation. This similarity index is equivalent to centered kernel alignment (CKA) and is also closely connected to CCA. Unlike CCA, CKA can reliably identify correspondences between representations in networks trained from different initializations.

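This paper examines methods for comparing neural network representations based on canonical correlation analysis (CCA) and proposes a similarity index that measures the relationship between representational similarity matrices. The index is equivalent to centered kernel alignment (CKA), is closely connected to CCA, and is not limited by representations whose dimension exceeds the number of data points. Unlike CCA, CKA can reliably identify correspondences between representations in networks trained from different initializations.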
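To make the similarity index concrete, the following is a minimal sketch of CKA with a linear kernel, written in NumPy. The abstract does not fix a kernel or an implementation, so the linear-kernel choice, the function names `center_columns` and `linear_cka`, and the convention that rows are data points and columns are features are all assumptions made for illustration.

```python
import numpy as np


def center_columns(x):
    """Subtract the per-feature mean so every column has zero mean."""
    return x - x.mean(axis=0, keepdims=True)


def linear_cka(x, y):
    """Linear-kernel CKA between two representations (illustrative sketch).

    x: array of shape (n_examples, p1), activations of one layer or model.
    y: array of shape (n_examples, p2), activations of another layer or model.
    Rows of x and y must correspond to the same examples.
    """
    x = center_columns(np.asarray(x, dtype=np.float64))
    y = center_columns(np.asarray(y, dtype=np.float64))
    # Squared Frobenius norm of the cross-covariance (up to a constant factor).
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    # Normalizers: Frobenius norms of each representation's own covariance.
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)


# Usage: compare a representation with a rotated, rescaled copy of itself.
rng = np.random.default_rng(0)
x = rng.standard_normal((50, 10))
q, _ = np.linalg.qr(rng.standard_normal((10, 10)))  # random orthogonal matrix
same = linear_cka(x, 3.0 * (x @ q))
unrelated = linear_cka(x, rng.standard_normal((50, 10)))
```

In this sketch `same` is 1 up to floating-point error, because the linear-kernel index is unchanged by orthogonal transformations and isotropic scaling, while `unrelated` is lower. The index is not invariant to arbitrary invertible linear transformations, which is the property that separates it from CCA-based measures.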

Similarity of Neural Network Representations Revisited