Widely used language models (LMs) are typically built by scaling up a
two-stage training pipeline: a pre-training stage that uses a very large,
diverse dataset of text and a fine-tuning (sometimes, 'alignment') stage that
uses targeted examples or other specifications of desired behaviors. While it
has been hypothesized that knowledge and skills come from pre-training, and
fine-tuning mostly filters this knowledge and skillset, this intuition has not
been extensively tested. To aid in doing so, we introduce a novel technique for
decoupling the knowledge and skills gained in these two stages, enabling a
direct answer to the question, "What would happen if we combined the knowledge
learned by a large model during pre-training with the knowledge learned by a
small model during fine-tuning (or vice versa)?" Using an RL-based framework
derived from recent developments in learning from human preferences, we
introduce emulated fine-tuning (EFT), a principled and practical method for
sampling from a distribution that approximates (or 'emulates') the result of
pre-training and fine-tuning at different scales. Our experiments with EFT show
that scaling up fine-tuning tends to improve helpfulness, while scaling up
pre-training tends to improve factuality. Beyond decoupling scale, we show that
EFT enables test-time adjustment of competing behavioral traits like
helpfulness and harmlessness without additional training. Finally, a special
case of emulated fine-tuning, which we call LM up-scaling, avoids
resource-intensive fine-tuning of large pre-trained models by ensembling them
with small fine-tuned models, essentially emulating the result of fine-tuning
the large pre-trained model. Up-scaling consistently improves helpfulness and
factuality of instruction-following models in the Llama, Llama-2, and Falcon
families, without additional hyperparameters or training.

通过借鉴 RL 的框架，引入了一种名为模拟微调（EFT）的技术，从而将预训练和微调的知识与技能解耦，并且通过扩大微调的规模来提高可帮助性，扩大预训练的规模来提高事实性，从而实现在测试时调整不同行为特征的方法，而无需额外训练。