Aug, 2024
Transformers are Minimax Optimal Nonparametric In-Context Learners
Juno Kim, Tai Nakamaki, Taiji Suzuki
TL;DR
This paper studies the efficacy of in-context learning (ICL) in large language models from the viewpoint of statistical learning theory, establishing approximation and generalization error bounds for transformers on nonparametric regression tasks. The results show that sufficiently trained transformers not only achieve minimax optimal estimation risk but can also improve their representations in context, revealing the key roles of task diversity and representation learning in ICL.
Abstract
In-context learning (ICL) of large language models has proven to be a surprisingly effective method of learning a new task from only a few demonstrative examples. In this paper, we study the efficacy of ICL from the viewpoint of