BriefGPT.xyz
Oct, 2023
A Long Way to Go: Investigating Length Correlations in RLHF
Prasann Singhal, Tanya Goyal, Jiacheng Xu, Greg Durrett
TL;DR
By optimizing for response length alone, this study shows that much of the reported improvement from reinforcement learning from human feedback (RLHF) can be reproduced; it also explores other methods for improving model performance without increasing length, and identifies correlations between response length and reward model scores.
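The TL;DR above refers to a correlation between response length and reward model scores. A minimal sketch of how such a correlation might be measured is below; the toy responses and scores are made-up illustrations, not data from the paper.

```python
# Hypothetical sketch: quantifying how strongly reward scores track
# response length, the kind of correlation the paper investigates.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Toy data: reward scores standing in for a reward model's outputs.
responses = [
    "Short answer.",
    "A somewhat longer answer with more detail.",
    "A much longer answer that elaborates extensively on the question.",
]
reward_scores = [0.2, 0.5, 0.9]

lengths = [len(r.split()) for r in responses]
print(round(pearson(lengths, reward_scores), 3))
```

A correlation near 1.0 would indicate the reward model's scores are largely explained by length alone.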
Abstract
Great successes have been reported using reinforcement learning from human feedback (RLHF) to align large language models. Open-source preference datasets and reward models have enabled wider experimentation beyond …