Multilayer Perceptrons Learn In-Context

May, 2024

MLPs Learn In-Context

William L. Tong, Cengiz Pehlevan

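TL;DR: In this study, we find that multilayer perceptrons (MLPs) and the closely related MLP-Mixer models can learn in-context as effectively as Transformers, and that MLPs perform better on some tasks involving relational reasoning. This result challenges some prior assumptions about simple connectionist models.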

Abstract

In-context learning (ICL), the remarkable ability to solve a task from only input exemplars, has commonly been assumed to be a unique hallmark of Transformer models. In this study, we demonstrate that