BriefGPT.xyz
Oct, 2023
An Emulator for Fine-Tuning Large Language Models using Small Language Models
Eric Mitchell, Rafael Rafailov, Archit Sharma, Chelsea Finn, Christopher D. Manning
TL;DR
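Drawing on a reinforcement learning (RL) framework, the paper introduces emulated fine-tuning (EFT), a technique that decouples the knowledge and skills acquired in pre-training from those acquired in fine-tuning. It finds that scaling up fine-tuning improves helpfulness, while scaling up pre-training improves factuality. EFT also allows different behavioral traits to be adjusted at test time, with no additional training.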
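A rough sketch of how this decoupling could work at decoding time is shown below. It assumes a combination rule that this summary does not spell out: the next-token log-probabilities of a large pre-trained model are shifted by the difference between a small fine-tuned model and its small pre-trained counterpart, then renormalized. The function names and the toy logits are hypothetical and serve only as illustration.

```python
import math


def log_softmax(logits):
    """Convert raw logits to normalized log-probabilities (numerically stable)."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def eft_next_token_logprobs(base_large, base_small, tuned_small):
    """Assumed EFT-style combination over one shared vocabulary.

    base_large  : logits from the large pre-trained model (knowledge)
    base_small  : logits from the small pre-trained model
    tuned_small : logits from the small fine-tuned model (behavior)
    """
    lp_large = log_softmax(base_large)
    lp_small = log_softmax(base_small)
    lp_tuned = log_softmax(tuned_small)
    # Add the small-scale "fine-tuning delta" to the large-scale base.
    combined = [l + (t - s) for l, s, t in zip(lp_large, lp_small, lp_tuned)]
    return log_softmax(combined)  # renormalize to a valid distribution


# Toy three-token vocabulary; the numbers are made up.
logprobs = eft_next_token_logprobs(
    base_large=[2.0, 1.0, 0.1],
    base_small=[1.5, 1.2, 0.3],
    tuned_small=[0.5, 2.0, 0.3],
)
probs = [math.exp(lp) for lp in logprobs]
```

Under this assumed rule, if the small fine-tuned model is identical to the small pre-trained model, the delta vanishes and the output reduces to the large pre-trained model's distribution. No model weights are updated; the three sets of logits are combined only when sampling.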
Abstract
Widely used language models (LMs) are typically built by scaling up a two-stage training pipeline: a pre-training stage that uses a very l