DMT: Comprehensive Distillation with Multiple Self-supervised Teachers

Dec, 2023

DMT: Comprehensive Distillation with Multiple Self-supervised Teachers

Yuang Liu, Jing Wang, Qiang Zhou, Fan Wang, Jun Wang...

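TL;DR: Compresses pretrained models by leveraging the strengths of multiple self-supervised models, significantly improving performance on both classification and dense tasks.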

Abstract

Numerous self-supervised learning paradigms, such as contrastive learning and masked image modeling, have been proposed to acquire powerful and general representations from unlabeled data. However, these models are commonly pretrained within their specific framework alone, failing to c