May, 2024
Understanding Transformer Reasoning Capabilities via Graph Algorithms
Clayton Sanford, Bahare Fatemi, Ethan Hall, Anton Tsitsulin, Mehran Kazemi...
TL;DR
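A study of how transformer scaling regimes (depth, width, and number of extra tokens) determine algorithmic reasoning capabilities, and of transformers' strong performance on graph reasoning tasks.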
Abstract
Which transformer scaling regimes are able to perfectly solve different classes of algorithmic problems? While tremendous empirical advances have been attained by transformer-based neural networks, a theoretical understanding of their ...