To understand neural network behavior, recent works quantitatively compare different networks' learned representations using canonical correlation analysis (CCA), centered kernel alignment (CKA), and other dissimilarity measures. Unfortunately, these widely used measures often disagree on fundamental observations, such as whether deep networks differing only in random initialization learn similar representations. These disagreements raise the question: which, if any, of these dissimilarity measures should we believe? We provide a framework to ground this question through a concrete test: measures should have sensitivity to changes that affect functional behavior, and specificity against changes that do not. We quantify this through a variety of functional behaviors including probing accuracy and robustness to distribution shift, and examine changes such as varying random initialization and deleting principal components. We find that current metrics exhibit different weaknesses, note that a classical baseline performs surprisingly well, and highlight settings where all metrics appear to fail, thus providing a challenge set for further improvement.
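As a rough illustration of the sensitivity and specificity test described above, the sketch below computes linear CKA between two representation matrices and compares two changes: an orthogonal rotation, which should not register as a change, and deletion of the top principal components, which should. This is a minimal sketch, not the paper's evaluation code. The helper names `linear_cka` and `delete_top_components`, the synthetic Gaussian data, and the choice of deleting four components are all assumptions made for illustration.

```python
import numpy as np


def linear_cka(X, Y):
    """Linear CKA between representations X (n x d1) and Y (n x d2).

    Rows are examples, columns are neurons. Returns a similarity in [0, 1];
    1 - linear_cka(X, Y) can be read as a dissimilarity.
    """
    # Center each neuron (column) over the examples.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)


def delete_top_components(X, k):
    """Hypothetical helper: zero out the k largest principal components of X."""
    Xc = X - X.mean(axis=0, keepdims=True)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    S = S.copy()
    S[:k] = 0.0  # singular values are sorted in descending order
    return (U * S) @ Vt


# Synthetic stand-in for a layer's activations (assumption: 200 examples, 16 neurons),
# scaled so that some directions carry more variance than others.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16)) * np.linspace(4.0, 1.0, 16)

# A random orthogonal matrix: rotating the representation should leave CKA at 1.
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
cka_rotated = linear_cka(X, X @ Q)

# Deleting principal components should lower the similarity.
cka_pruned = linear_cka(X, delete_top_components(X, 4))

print(f"CKA after rotation: {cka_rotated:.4f}")
print(f"CKA after deleting 4 components: {cka_pruned:.4f}")
```

In a real test, the similarity score would be compared against a measured functional change such as probing accuracy. The synthetic data here only shows how the measure itself responds to each intervention.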

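This paper provides a framework for evaluating the measures commonly used to compare neural network representations, testing them for sensitivity to changes that affect functional behavior and for specificity against changes that do not. The study finds that current measures exhibit different weaknesses and that a classical baseline performs surprisingly well. The authors highlight settings where all measures fail and offer them as a challenge set for future research.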

Grounding Representation Similarity with Statistical Testing