In this paper, we propose a novel training procedure for the continual representation learning problem, in which a neural network model is trained sequentially, with the goal of alleviating catastrophic forgetting in visual search tasks. Our method, called Contrastive Supervised Distillation (CSD), reduces feature forgetting while learning discriminative features. This is achieved by leveraging label information in a distillation setting in which the student model is contrastively learned from the teacher model. Extensive experiments show that CSD mitigates catastrophic forgetting more effectively than current state-of-the-art methods. Our results also provide further evidence that feature forgetting evaluated in visual retrieval tasks is not as catastrophic as in classification tasks. Code at: https://github.com/NiccoBiondi/ContrastiveSupervisedDistillation.
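
The idea of learning the student contrastively from the teacher under label supervision can be illustrated with a minimal sketch. It assumes a supervised-contrastive-style objective in which each student feature is pulled toward the teacher features of same-class samples and pushed away from the teacher features of other classes. The function name, the temperature value, and the batch-wise formulation are assumptions made for illustration; this is not necessarily the exact loss used by CSD, for which the linked repository is the reference.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def supervised_contrastive_distillation_loss(student, teacher, labels,
                                             temperature=0.1):
    """Hypothetical label-aware contrastive distillation loss.

    student, teacher: lists of feature vectors for the same batch, produced
        by the model being trained and by the frozen previous model.
    labels: class label of each sample in the batch.
    temperature: softmax temperature (0.1 is an assumed value).
    """
    student = [l2_normalize(v) for v in student]
    teacher = [l2_normalize(v) for v in teacher]

    total = 0.0
    for i, s in enumerate(student):
        # Similarity of student feature i to every teacher feature.
        logits = [dot(s, t) / temperature for t in teacher]

        # Numerically stable log-sum-exp over the whole batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))

        # Positives: teacher features sharing the anchor's label.
        # The anchor's own teacher feature is always included.
        positives = [j for j, y in enumerate(labels) if y == labels[i]]

        # Average negative log-probability of the positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)

    return total / len(student)
```

With toy two-class features, a student whose features agree with the teacher's class structure receives a lower loss than one whose features point toward the other class, which is the behavior a distillation term of this kind is meant to encourage.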

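This paper proposes a training procedure called Contrastive Supervised Distillation (CSD) to address catastrophic forgetting in continual representation learning. It exploits label information in a distillation setting to reduce feature forgetting and learn discriminative features, with the student model learned contrastively from the teacher model. CSD mitigates catastrophic forgetting in visual retrieval tasks and outperforms current state-of-the-art methods.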

Contrastive Supervised Distillation for Continual Representation Learning