June 2024
You Only Need Less Attention at Each Stage in Vision Transformers
TL;DR: Vision Transformers (ViTs) have revolutionized computer vision, but their quadratic attention complexity and attention-saturation issues limit practical deployment. The Less-Attention Vision Transformer (LaViT) reduces the number of full attention computations at each stage by reusing previously calculated attention scores in subsequent layers, yielding better efficiency and performance across vision tasks.
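The core idea can be sketched as follows: an early layer computes full scaled dot-product attention and caches its score matrix; a later "less-attention" layer then derives its attention map from those cached scores via a lightweight transform, skipping the QK^T computation. This is a minimal NumPy sketch under stated assumptions, not the paper's exact formulation: the function names and the linear transform `w` (identity here, learned in practice) are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def vanilla_attention(q, k, v):
    # standard scaled dot-product attention; also returns the raw
    # score matrix so a later layer can reuse it
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    return softmax(scores) @ v, scores

def less_attention(prev_scores, v, w):
    # hypothetical reuse step: transform the cached score matrix with a
    # lightweight linear map `w` instead of recomputing QK^T
    new_scores = prev_scores @ w
    return softmax(new_scores) @ v, new_scores

rng = np.random.default_rng(0)
n, d = 4, 8  # tokens, embedding dim
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))

out1, scores = vanilla_attention(q, k, v)  # first layer: full attention
w = np.eye(n)                              # identity transform for the demo
out2, _ = less_attention(scores, v, w)     # later layer: reuses cached scores
print(out1.shape, out2.shape)
```

With the identity transform the reuse layer reproduces the full-attention output exactly; the efficiency gain in the real architecture comes from replacing the O(n²·d) score computation with a cheap transform of the cached n×n matrix.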