BriefGPT.xyz
May, 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
Peng Gao, Teli Ma, Hongsheng Li, Jifeng Dai, Yu Qiao
TL;DR
This paper introduces the ConvMAE framework, which applies masked auto-encoding to feature pretraining for Vision Transformers, improving their performance on various vision tasks. Using masked convolution and directly supervising the features of the convolutional stages improves classification and detection accuracy while preserving computational efficiency.
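The masked convolution mentioned above can be illustrated with a minimal NumPy sketch. This is an assumption-laden simplification, not the paper's implementation: a single-channel 3x3 convolution where the binary visibility mask zeroes the input before the convolution and re-zeroes the output after it, so no information leaks from masked patches into visible positions.

```python
import numpy as np

def masked_conv2d(x, weight, mask):
    """Illustrative masked convolution (simplified, single-channel).

    x:      (H, W) feature map
    weight: (3, 3) convolution kernel
    mask:   (H, W) binary visibility mask, 1 = visible, 0 = masked
    """
    x = x * mask                       # hide masked regions from the conv
    H, W = x.shape
    padded = np.pad(x, 1)              # zero-pad so output keeps shape (H, W)
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * weight)
    return out * mask                  # keep masked positions zeroed
```

Because the mask is applied both before and after the convolution, changing values under masked positions leaves the output unchanged, which is the property that lets the encoder operate only on visible tokens.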
Abstract
Vision Transformers (ViT) become widely-adopted architectures for various vision tasks. Masked auto-encoding for feature pretraining and multi-scale …