Nov, 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan, Huaibo Huang, Ran He
TL;DR
To address the performance degradation of linear attention on image tasks, this work introduces a rank analysis of linear attention from two perspectives: the key-value buffer and the output features. Building on this analysis, it proposes Rank-Augmented Linear Attention (RALA) and constructs the Rank-Augmented Vision Linear Transformer (RAVLT), which performs strongly across multiple vision tasks, reaching 84.4% Top-1 accuracy on ImageNet-1k and demonstrating the method's considerable potential.
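The low-rank issue referred to above can be illustrated with a short numerical sketch. This is not the paper's RALA implementation; the feature map phi, token count N, and head dimension d below are generic assumptions, chosen only to show that the implicit attention map of kernelized linear attention has rank at most d, whereas a softmax attention map is typically full rank in the token count.

import numpy as np

# Illustrative sketch (not the paper's code): in kernelized linear attention the
# implicit attention map phi(Q) @ phi(K).T factors through d-dimensional features,
# so its rank is at most the head dimension d, far below the token count N.
rng = np.random.default_rng(0)
N, d = 196, 32                              # e.g. 14x14 image tokens, head dim 32
phi = lambda x: np.maximum(x, 0.0)          # a simple non-negative feature map

Q = rng.standard_normal((N, d))
K = rng.standard_normal((N, d))

attn_linear = phi(Q) @ phi(K).T             # (N, N) implicit linear-attention map

scores = Q @ K.T / np.sqrt(d)               # softmax attention map for comparison
attn_softmax = np.exp(scores - scores.max(axis=-1, keepdims=True))
attn_softmax /= attn_softmax.sum(axis=-1, keepdims=True)

print(np.linalg.matrix_rank(attn_linear))   # at most d = 32
print(np.linalg.matrix_rank(attn_softmax))  # typically full rank, i.e. 196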
Abstract
The Softmax Attention mechanism in Transformer models is notoriously computationally expensive, particularly due to its quadratic complexity, posing significant challenges in vision applications. In contrast, Linear Attention …
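The complexity contrast in the abstract can be made concrete with a back-of-the-envelope FLOP count. The formulas below are standard estimates for single-head attention, not figures from the paper: softmax attention grows quadratically with the number of tokens N, while linear attention grows linearly in N (and quadratically in the head dimension d).

def softmax_attention_flops(N: int, d: int) -> int:
    # Q @ K^T costs N*N*d multiply-adds; applying the map to V costs another N*N*d.
    return 2 * N * N * d

def linear_attention_flops(N: int, d: int) -> int:
    # phi(K)^T @ V builds a d x d buffer in N*d*d; phi(Q) times that buffer is N*d*d.
    return 2 * N * d * d

for N in (196, 784, 3136):                  # token counts for 14x14, 28x28, 56x56 maps
    print(N, softmax_attention_flops(N, 64), linear_attention_flops(N, 64))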