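TL;DR: This work proposes a Multi-Head Scan (MHS) module based on State Space Models that builds visual features in 2D image space through 1D selective scans. It adds a Scan Route Attention (SRA) mechanism to improve the module's ability to discern complex structures, and its experiments show a significant performance improvement together with a reduction in parameter count.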
Abstract
Recently, state space models (SSMs), with Mamba as a prime example, have shown great promise for long-range dependency modeling with linear complexity. Then, Vision →