Jun, 2020
On the Computational Power of Transformers and Its Implications in Sequence Modeling
Satwik Bhattamishra, Arkil Patel, Navin Goyal
TL;DR
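This paper studies the computational power and Turing completeness of Transformer networks. It shows that Transformers with only positional masking and no positional encoding are also Turing-complete, and that certain residual connections are necessary for this result. Experiments on machine translation and synthetic tasks illustrate the practical implications of these findings.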
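To make "positional masking without positional encoding" concrete, here is a minimal NumPy sketch of single-head self-attention with a causal mask. It is an illustration of the masking mechanism only, not the construction used in the paper. The function name `causal_self_attention` and the use of identity query/key/value projections are assumptions made for brevity.

```python
import numpy as np

def causal_self_attention(x):
    """Single-head self-attention with a causal mask and no positional encoding.

    x: array of shape (seq_len, d).
    Queries, keys, and values are all x itself (identity projections,
    an assumption made to keep the sketch short).
    """
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)
    # True above the diagonal marks "future" positions to be hidden.
    future = np.triu(np.ones((n, n), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over each row; masked entries become 0.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ x, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))   # 4 tokens, no positional encoding added
out, w = causal_self_attention(x)
```

Because position `i` can only attend to positions `0..i`, the mask alone makes each output depend on how much of the sequence precedes it, even though no positional encoding is added to the input. For example, the first position can only attend to itself, so `out[0]` equals `x[0]`.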
Abstract
Transformers are being used extensively across several sequence modeling tasks. Significant research effort has been devoted to experimentally probe the inner workings of Transformers. However, our conceptual and …