BriefGPT.xyz
Oct, 2024
On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
Kevin Xu, Issei Sato
TL;DR
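This study examines the expressive power of Looped Transformers for function approximation, which remains underexplored. By defining a modulus of continuity for sequence-to-sequence functions, the authors reveal a limitation of the looped architecture and propose introducing scaling parameters for each loop, conditioned on a timestep encoding. Experimental results show that increasing the number of loops improves performance, and that the timestep-encoding architecture improves it further.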
Abstract
Looped Transformers offer advantages in parameter efficiency and Turing completeness. However, their expressive power for function approximation and approximation rate remains underexplored. In this paper, we est