CVPRJun, 2024

视觉 Transformer 的每个阶段只需要更少的注意力

TL;DRVision Transformers (ViTs) have revolutionized computer vision, but their computational complexity and attention saturation issues limit their practical application. The Less-Attention Vision Transformer (LaViT) proposes a novel approach that reduces the number of attention operations and leverages previously calculated attention scores, resulting in superior efficiency and performance across vision tasks.