TL;DR: Through experiments with LIMA, this study finds that almost all of a large language model's knowledge is learned during pretraining, and only a limited amount of instruction tuning data is needed to teach the model to produce high-quality outputs.
Abstract
Large language models are trained in two stages: (1) unsupervised pretraining
from raw text, to learn general-purpose representations, and (2) large scale
instruction tuning and reinforcement learning, to better