Nov, 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan, Huaibo Huang, Ran He
TL;DR
This work addresses the performance degradation that linear attention suffers, relative to Softmax attention, when modeling complex spatial information, and proposes Rank-Augmented Linear Attention (RALA), which preserves linear complexity while achieving performance comparable to Softmax attention. Building on RALA, the authors construct the Rank-Augmented Vision Linear Transformer (RAVLT). Experiments show that RAVLT performs strongly across multiple vision tasks, notably reaching 84.4% Top-1 accuracy on ImageNet-1k, demonstrating the strong potential of RALA.
Abstract
The Softmax attention mechanism in Transformer models is notoriously computationally expensive, particularly due to its quadratic complexity, posing significant challenges in vision applications. In contrast, …
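
To make the complexity contrast concrete, below is a minimal sketch (not the paper's RALA) of generic kernelized linear attention versus Softmax attention: reordering the computation as phi(Q)(phi(K)ᵀV) avoids materializing the n×n attention matrix, trading O(n²d) for O(nd²). The elu(·)+1 feature map and the shapes used are illustrative assumptions, not details taken from this paper.

```python
# Hypothetical comparison sketch; not the authors' implementation.
import torch
import torch.nn.functional as F

def softmax_attention(q, k, v):
    # q, k, v: (batch, n_tokens, dim)
    # Materializes the (n_tokens x n_tokens) attention matrix -> O(n^2 * d).
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

def linear_attention(q, k, v, eps=1e-6):
    # Kernel trick: phi(q) @ (phi(k)^T v) never forms the n x n matrix -> O(n * d^2).
    phi_q, phi_k = F.elu(q) + 1, F.elu(k) + 1            # non-negative feature map (assumed)
    kv = phi_k.transpose(-2, -1) @ v                     # (dim, dim) summary of keys/values
    z = phi_q @ phi_k.sum(dim=-2, keepdim=True).transpose(-2, -1) + eps  # per-token normalizer
    return (phi_q @ kv) / z

x = torch.randn(2, 196, 64)   # e.g. 14x14 image tokens, 64-dim
out_soft = softmax_attention(x, x, x)
out_lin = linear_attention(x, x, x)
print(out_soft.shape, out_lin.shape)  # both (2, 196, 64)
```

The feature-map summary `kv` has rank at most dim, which is the low-rank bottleneck the paper's RALA is designed to alleviate while keeping the linear cost.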