February 2023
MOMA: Distill from Self-Supervised Teachers
Yuchong Yao, Nandakishor Desai, Marimuthu Palaniswami
TL;DR
Proposes a framework named MOMA that combines knowledge from MoCo and MAE collaboratively, in a self-supervised manner, through three distinct knowledge-transfer mechanisms, producing compact student models. The approach is computationally efficient, using an extremely high masking ratio and significantly fewer training epochs, and experiments show that MOMA achieves competitive performance across different computer-vision benchmarks.
Abstract
Contrastive Learning and Masked Image Modelling have demonstrated exceptional performance on self-supervised representation learning, where Momentum Contrast (i.e., MoCo) and Masked AutoEncoder (i.e., MAE) …
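To make the distillation setup concrete, below is a minimal, illustrative sketch, not the authors' code: the excerpt does not detail MOMA's three knowledge-transfer mechanisms, so a simple feature-matching loss against two frozen self-supervised teachers (a MoCo-style and a MAE-style encoder) stands in for them. The module and function names (TinyEncoder, distill_step) are hypothetical.

```python
# Illustrative sketch (PyTorch): distill a compact student from two frozen
# self-supervised teachers. This is an assumption-based stand-in for MOMA's
# actual transfer mechanisms, which are not described in this excerpt.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Hypothetical compact backbone producing a global feature vector."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, x):
        return self.net(x)

def distill_step(student, moco_teacher, mae_teacher, images, optimizer):
    """One training step: match the student's features to both frozen teachers."""
    with torch.no_grad():  # teachers stay frozen
        t_moco = F.normalize(moco_teacher(images), dim=-1)
        t_mae = F.normalize(mae_teacher(images), dim=-1)
    s = F.normalize(student(images), dim=-1)
    # Cosine-similarity feature distillation against each teacher.
    loss = (1 - (s * t_moco).sum(-1).mean()) + (1 - (s * t_mae).sum(-1).mean())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    dim = 256
    student = TinyEncoder(dim)
    # Stand-ins for pretrained MoCo / MAE encoders projected to the same dim.
    moco_teacher = TinyEncoder(dim).eval()
    mae_teacher = TinyEncoder(dim).eval()
    opt = torch.optim.AdamW(student.parameters(), lr=1e-4)
    images = torch.randn(8, 3, 64, 64)  # dummy batch
    print(distill_step(student, moco_teacher, mae_teacher, images, opt))
```

In the paper's setting the student would instead see masked inputs at a high masking ratio and be trained for relatively few epochs; this sketch only shows the basic two-teacher distillation loop.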