A major paradigm for learning image representations in a self-supervised manner is to learn a model that is invariant to some predefined image transformations (cropping, blurring, color jittering, etc.), while regularizing the embedding distribution to avoid learning a degenerate solution. Our first contribution is a general kernel framework for designing a generic regularization loss that encourages the embedding distribution to be close to the uniform distribution on the hypersphere, with respect to the maximum mean discrepancy pseudometric. Our framework uses rotation-invariant kernels defined on the hypersphere, also known as dot-product kernels. Our second contribution is to show that this flexible kernel approach encompasses several existing self-supervised learning methods, including uniformity-based and information-maximization methods. Finally, by empirically exploring several kernel choices, our experiments demonstrate that using a truncated rotation-invariant kernel provides competitive results compared to state-of-the-art methods, and we show practical situations where our method benefits from the kernel trick to reduce computational complexity.
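
To make the regularizer concrete, here is a minimal NumPy sketch of a squared maximum mean discrepancy (MMD) between a batch of unit-norm embeddings and the uniform distribution on the hypersphere. It is an illustration under stated assumptions, not the loss used in the experiments: the degree-2 dot-product kernel k(u, v) = b1 (u·v) + b2 ((u·v)² - 1/d), the weights `b1` and `b2`, and all function names are hypothetical choices made for this example. For any rotation-invariant kernel, the expected kernel value against a uniform sample is a constant, so the squared MMD reduces to the mean pairwise kernel value minus that constant; for the kernel assumed here the constant is zero. The two functions compute the same quantity, once from the n × n Gram matrix and once from d-dimensional moments, which is the kind of choice the kernel trick makes available.

```python
import numpy as np


def l2_normalize(x):
    """Project each row of x onto the unit hypersphere."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def mmd2_gram(z, b1=1.0, b2=1.0):
    """Squared MMD to the uniform distribution, from the n x n Gram matrix.

    Assumes the rows of z are unit vectors and the (hypothetical) kernel
    k(u, v) = b1 * (u.v) + b2 * ((u.v)**2 - 1/d).
    Cost is O(n^2 d) time and O(n^2) memory.
    """
    n, d = z.shape
    t = z @ z.T  # pairwise dot products, shape (n, n)
    return b1 * t.mean() + b2 * ((t ** 2).mean() - 1.0 / d)


def mmd2_features(z, b1=1.0, b2=1.0):
    """Same quantity from d-dimensional moments, without the Gram matrix.

    Uses mean_ij(z_i.z_j) = ||mean_i z_i||^2 and
    mean_ij((z_i.z_j)^2) = ||Z^T Z / n||_F^2.
    Cost is O(n d^2) time and O(d^2) memory.
    """
    n, d = z.shape
    mean = z.mean(axis=0)   # first moment, shape (d,)
    second = z.T @ z / n    # uncentered second moment, shape (d, d)
    return b1 * (mean @ mean) + b2 * ((second ** 2).sum() - 1.0 / d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spread = l2_normalize(rng.normal(size=(256, 32)))  # roughly uniform batch
    collapsed = l2_normalize(np.ones((256, 32)))       # degenerate batch
    print("spread   :", mmd2_gram(spread), mmd2_features(spread))
    print("collapsed:", mmd2_gram(collapsed), mmd2_features(collapsed))
```

When the batch size n is smaller than the embedding dimension d, the Gram form avoids building any d × d matrix; when n is larger than d, the moment form is cheaper. A collapsed batch, where every embedding is identical, yields a strictly positive value, while a batch whose first two moments match those of the uniform distribution yields zero.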

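This work proposes a regularization loss based on kernel mean embeddings for self-supervised learning of image representations. The loss uses rotation-invariant kernels on the hypersphere, also known as dot-product kernels. Besides being competitive with the state of the art, our method significantly reduces the time and memory complexity of self-supervised training, which makes very large embedding dimensions feasible on existing hardware and makes the method easier to adapt to resource-limited settings than previous methods.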

Self-Supervised Learning with Rotation-Invariant Kernels