Transformer models can use two fundamentally different kinds of information:
information stored in weights during training, and information provided
``in-context'' at inference time. In this work, we show that transformers
exhibit different inductive biases in how they represent and generalize from
the information in these two sources. In particular, we characterize whether
they generalize via parsimonious rules (rule-based generalization) or via
direct comparison with observed examples (exemplar-based generalization). This
distinction has important practical consequences, as it informs whether to encode
information in weights or in context, depending on how we want models to use
that information. In transformers trained on controlled stimuli, we find that
generalization from weights is more rule-based whereas generalization from
context is largely exemplar-based. In contrast, we find that in transformers
pretrained on natural language, in-context learning is significantly
rule-based, with larger models being more rule-based. We hypothesize that
rule-based generalization from in-context information might be an emergent
consequence of large-scale training on language, which has sparse rule-like
structure. Using controlled stimuli, we verify that transformers pretrained on
data containing sparse rule-like structure exhibit more rule-based
generalization.

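This paper studies the inductive biases of transformer models. It finds that models pretrained on data with sparse rule-like structure lean toward rule-based generalization, whereas in-context generalization in transformers trained on controlled stimuli is largely exemplar-based.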