Jun, 2021
RegionViT: Regional-to-Local Attention for Vision Transformers
Chun-Fu Chen, Rameswar Panda, Quanfu Fan
TL;DR
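This paper proposes a vision transformer (ViT) architecture that adopts a pyramid structure and a new regional-to-local attention, and that outperforms current state-of-the-art ViT variants on four tasks, including image classification and object detection.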
Abstract
Vision transformer (ViT) has recently shown its strong capability in achieving comparable results to convolutional neural networks (CNNs) on image classification. However, vanilla ViT simply inherits the same ar