Oct, 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai...
TL;DR
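This paper studies using a Transformer in place of a CNN for image classification. It obtains better recognition results than current convolutional networks while using fewer computational resources, which marks a breakthrough for computer vision.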
Abstract
While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision,