BriefGPT.xyz
May, 2022
Simple Open-Vocabulary Object Detection with Vision Transformers
Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn...
TL;DR
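The method pairs contrastive image-text pre-training with end-to-end detection fine-tuning and, together with scaling up image-level pre-training and model size, achieves open-vocabulary object detection with a Vision Transformer in both zero-shot and one-shot settings.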
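As a rough illustration of the contrastive image-text pre-training named in the TL;DR, the sketch below computes a symmetric contrastive loss over a batch of paired image and text embeddings. It is a generic NumPy sketch, not the authors' implementation: the function names, the temperature value of 0.07, and the embedding shapes are all assumptions made for the example.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss for a batch where row i of each input is a matched pair.

    image_emb, text_emb: arrays of shape (batch, dim). The temperature is an
    assumed illustrative value, not one taken from the paper.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature  # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])
    # Image-to-text: each image should pick out its own caption within its row.
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-image: each caption should pick out its own image within its column.
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)
```

Matched pairs sit on the diagonal of the similarity matrix, so the loss falls as each image embedding moves closer to its own caption and away from the other captions in the batch.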
Abstract
Combining simple architectures with large-scale pre-training has led to massive improvements in image classification. For object detection, pre-t…