Oct, 2023
快速多极注意力:一种长序列的分而治之注意机制
Fast Multipole Attention: A Divide-and-Conquer Attention Mechanism for Long Sequences
Yanming Kang, Giang Tran, Hans De Sterck
TL;DRTransformer-based models have achieved state-of-the-art performance, but the quadratic complexity of self-attention limits their applicability to long sequences; Fast Multipole Attention addresses this issue by reducing time and memory complexity, while maintaining a global receptive field with a hierarchical approach.