BriefGPT.xyz
Oct, 2023
Vanishing Gradients in Reinforcement Finetuning of Language Models
Noam Razin, Hattie Zhou, Omid Saremi, Vimal Thilak, Arwen Bradley...
TL;DR
RFT suffers from a vanishing gradients problem: experiments and theoretical analysis show that a small reward standard deviation leads to vanishing gradients, and that this is both prevalent and detrimental. Among common practices, an initial supervised finetuning (SFT) phase is the most promising remedy; even a few SFT steps, inexpensive in terms of compute and data labeling, can be crucial for RFT to succeed.
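A minimal sketch of the mechanism the TL;DR describes (not the paper's experimental setup): in a REINFORCE-style policy gradient with a mean-reward baseline, the gradient is a reward-centered expectation, so when the reward standard deviation across sampled outputs is small, the gradient norm shrinks proportionally. The toy softmax policy and reward vectors below are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def policy_grad_norm(theta, rewards):
    """Exact REINFORCE gradient with a mean-reward baseline for a
    softmax policy over a small discrete set of outputs:
    grad = sum_a pi(a) * (r(a) - baseline) * grad_theta log pi(a)."""
    pi = softmax(theta)
    baseline = pi @ rewards                     # expected reward under pi
    centered = rewards - baseline
    # grad_theta log pi(a) = e_a - pi for a softmax policy
    grad = (pi * centered)[:, None] * (np.eye(len(theta)) - pi[None, :])
    return np.linalg.norm(grad.sum(axis=0))

theta = np.zeros(4)                             # uniform initial policy
wide = np.array([0.0, 0.2, 0.8, 1.0])           # rewards vary across outputs
narrow = wide * 0.01                            # same shape, 100x smaller std

print(policy_grad_norm(theta, wide))
print(policy_grad_norm(theta, narrow))          # exactly 100x smaller
```

Because the centered rewards enter the gradient linearly, scaling the reward spread down by 100x scales the gradient norm down by the same factor, regardless of the policy's parameters: this is the vanishing-gradient effect tied to small reward standard deviation.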
Abstract
Pretrained language models are commonly aligned with human preferences and downstream tasks via reinforcement finetuning (RFT), which entails maximizing a (possibly learned) reward function using policy gradient …