BriefGPT.xyz
Jun, 2021
Saturated Transformers Are Constant-Depth Threshold Circuits
On the Power of Saturated Transformers: A View from Circuit Complexity
William Merrill, Yoav Goldberg, Roy Schwartz, Noah A. Smith
TL;DR
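This paper studies the circuit complexity of transformer models that use saturated attention. It proves that such models can be simulated by constant-depth threshold circuits, which bounds the formal languages they can recognize.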
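As a rough illustration of the two objects named in the TL;DR, the sketch below shows saturated attention (weight spread uniformly over the highest-scoring positions) and a single threshold gate (outputs 1 when enough inputs are on). The function names `saturated_attention`, `threshold_gate`, and `majority` are hypothetical and chosen for this example; this is not the paper's construction, only a minimal picture of the definitions under that assumption.

```python
from fractions import Fraction


def saturated_attention(scores):
    """Return attention weights that are uniform over the argmax positions.

    Assumption for this sketch: saturated attention puts equal weight on
    every position whose score ties for the maximum, and zero elsewhere.
    """
    top = max(scores)
    n_winners = sum(1 for s in scores if s == top)
    return [Fraction(1, n_winners) if s == top else Fraction(0) for s in scores]


def threshold_gate(bits, k):
    """Output 1 if at least k of the input bits are 1, else 0."""
    return int(sum(bits) >= k)


def majority(bits):
    """Strict majority, expressed as a single threshold gate."""
    return threshold_gate(bits, len(bits) // 2 + 1)


if __name__ == "__main__":
    # Two positions tie for the maximum score, so each gets weight 1/2.
    print(saturated_attention([1, 3, 3]))
    print(majority([1, 1, 0]))
```

Exact fractions are used instead of floats so that the uniform weights can be compared without rounding concerns.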
Abstract
Transformers have become a standard architecture for many NLP problems. This has motivated theoretically analyzing their capabilities as models of language, in order to understand what makes them successful, and …