Nov 2021
Shunted Self-Attention via Multi-Scale Token Aggregation
Sucheng Ren, Daquan Zhou, Shengfeng He, Jiashi Feng, Xinchao Wang
TL;DR
This paper proposes a novel self-attention strategy, named SSA, that enables Vision Transformer models to capture features at multiple scales within a single self-attention layer; extensive experiments validate the approach and show that it outperforms comparable models.
Abstract
Recent vision transformer (ViT) models have demonstrated encouraging results across various computer vision tasks, thanks to their competence in modeling long-range dependencies of image patches or tokens via self-attention.
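
The mechanism summarized above admits a compact reading: split the attention heads of one layer into groups, and let each group attend over keys and values aggregated (downsampled) at a different rate, so a single layer mixes coarse and fine token scales. The PyTorch module below is a minimal sketch under that reading; the class name, the strided-convolution aggregation, and the parameter choices (num_heads, sr_ratios) are illustrative assumptions, not the authors' reference implementation.

# Minimal sketch of shunted self-attention via multi-scale token aggregation.
# Head groups attend to keys/values downsampled at different rates, so one
# attention layer models multiple scales. Hypothetical names and settings.
import torch
import torch.nn as nn

class ShuntedSelfAttentionSketch(nn.Module):
    def __init__(self, dim, num_heads=4, sr_ratios=(1, 2)):
        super().__init__()
        assert num_heads % len(sr_ratios) == 0 and dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.sr_ratios = sr_ratios
        self.heads_per_group = num_heads // len(sr_ratios)
        self.q = nn.Linear(dim, dim)
        # One key/value projection per scale group.
        self.kv = nn.ModuleList(
            nn.Linear(dim, 2 * self.heads_per_group * self.head_dim)
            for _ in sr_ratios
        )
        # Token aggregation: a strided conv merges r x r tokens into one.
        self.sr = nn.ModuleList(
            nn.Conv2d(dim, dim, kernel_size=r, stride=r) if r > 1 else nn.Identity()
            for r in sr_ratios
        )
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, H, W):
        # x: (B, N, C) token embeddings with N == H * W.
        B, N, C = x.shape
        q = self.q(x).reshape(B, N, self.num_heads, self.head_dim).transpose(1, 2)
        outs = []
        for i, r in enumerate(self.sr_ratios):
            # Aggregate tokens at rate r before computing keys/values.
            feat = x.transpose(1, 2).reshape(B, C, H, W)
            feat = self.sr[i](feat).reshape(B, C, -1).transpose(1, 2)  # (B, N/r^2, C)
            kv = self.kv[i](feat).reshape(B, -1, 2, self.heads_per_group, self.head_dim)
            k, v = kv.permute(2, 0, 3, 1, 4)  # each: (B, h_g, N/r^2, d)
            # This group's query heads attend over the r-aggregated tokens.
            qg = q[:, i * self.heads_per_group:(i + 1) * self.heads_per_group]
            attn = (qg @ k.transpose(-2, -1)) * self.scale
            outs.append(attn.softmax(dim=-1) @ v)  # (B, h_g, N, d)
        out = torch.cat(outs, dim=1).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

# Usage: 2 images, a 16x16 token grid, 64-dim embeddings.
# attn = ShuntedSelfAttentionSketch(dim=64, num_heads=4, sr_ratios=(1, 2))
# y = attn(torch.randn(2, 16 * 16, 64), H=16, W=16)  # -> (2, 256, 64)

Note the design trade-off this sketch illustrates: heads with r = 1 keep full-resolution keys/values for fine detail, while heads with larger r attend over fewer, coarser tokens, reducing attention cost while widening the effective receptive field within the same layer.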