Nov, 2023

How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure

TL;DR: Transformer-based large language models generalize word distributions well across related contexts seen during pre-training, but they fail to generalize between contexts that were not observed in pre-training: instead of drawing on more abstract structural generalizations, they fall back on linear order.