Aug, 2021
Making Transformers Solve Compositional Tasks
Santiago Ontañón, Joshua Ainslie, Vaclav Cvicek, Zachary Fisher
TL;DR
By exploring the design space of Transformer models, we find that several design decisions have a large impact on the model's inductive bias. These decisions can significantly improve the compositional generalization ability of Transformer models, achieving better generalization than previously reported across a range of compositional tasks, and reaching state-of-the-art results on a semantic-parsing compositional generalization benchmark (COGS) and a string-edit-operation compositional benchmark (PCFG).
Abstract
Several studies have reported the inability of transformer models to generalize compositionally, a key type of generalization in many NLP tasks such as semantic parsing. In this paper we explore the design space …