Oct, 2024
Accelerated Preference Optimization for Large Language Model Alignment
Jiafan He, Huizhuo Yuan, Quanquan Gu
TL;DR
This work addresses the efficiency of aligning large language models (LLMs) with human preferences by proposing a new preference optimization framework. By incorporating Nesterov's momentum technique, the framework accelerates the preference optimization process; its convergence rate is proven theoretically to be faster than that of standard methods, and experiments also demonstrate its superiority on standard benchmarks.
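The Nesterov momentum idea referred to above can be illustrated on a toy objective. This is a generic sketch of Nesterov's extrapolation-then-gradient-step scheme, not the paper's actual accelerated preference optimization update; the function names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def nesterov_step(theta, theta_prev, grad_fn, lr=0.1, beta=0.9):
    """One Nesterov update: extrapolate along the previous step,
    then take a gradient step from the look-ahead point."""
    lookahead = theta + beta * (theta - theta_prev)
    return lookahead - lr * grad_fn(lookahead)

# Toy loss f(theta) = 0.5 * ||theta||^2, whose gradient is theta itself.
grad_fn = lambda t: t

theta_prev = np.array([5.0, -3.0])
theta = theta_prev.copy()
for _ in range(50):
    theta, theta_prev = nesterov_step(theta, theta_prev, grad_fn), theta

print(np.linalg.norm(theta))  # the iterate shrinks toward the minimizer 0
```

The extrapolation term `beta * (theta - theta_prev)` is what distinguishes this from plain gradient descent and is the source of the faster convergence rate on smooth convex objectives.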
Abstract
Reinforcement Learning from Human Feedback (RLHF) has emerged as a pivotal tool for aligning Large Language Models (LLMs) with human preferences. Direct