May, 2023
How to Query Human Feedback Efficiently in RL?
Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee
TL;DR
The paper proposes an efficient scheme for sampling trajectory pairs that explores the hidden reward function, collecting the pairs before any human feedback is gathered so the reward can be learned accurately. It requires less human feedback than the existing literature to learn a near-optimal policy under preference-based models, and covers both linear and low-rank MDPs.
Abstract
Reinforcement learning with human feedback (RLHF) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories, rather than explicit reward signals. […]
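To make the feedback model concrete, below is a minimal sketch of the pairwise preference setting the abstract describes: two trajectories are compared under a Bradley-Terry model driven by a hidden linear reward (the linear setting mentioned in the TL;DR), and the reward parameter is recovered by maximum likelihood from the comparison labels. The feature vectors, dimensions, and data-generating process here are illustrative assumptions, not the paper's actual sampling algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each trajectory tau is summarized by a feature vector
# phi(tau), and the hidden reward is linear, r(tau) = <theta*, phi(tau)>.
d, n_pairs = 8, 500
theta_star = rng.normal(size=d)

def sample_pair():
    """Sample a trajectory pair and a Bradley-Terry preference label.

    P(tau1 preferred over tau0) = sigmoid(r(tau1) - r(tau0)).
    """
    phi0, phi1 = rng.normal(size=d), rng.normal(size=d)
    p = 1.0 / (1.0 + np.exp(-(phi1 - phi0) @ theta_star))
    return phi1 - phi0, rng.random() < p

X, y = zip(*(sample_pair() for _ in range(n_pairs)))
X, y = np.array(X), np.array(y, dtype=float)

# Maximum-likelihood estimate of theta: gradient ascent on the Bradley-Terry
# log-likelihood, i.e. logistic regression on trajectory feature differences.
theta = np.zeros(d)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    theta += 0.1 / n_pairs * X.T @ (y - p)

# The reward is identifiable only up to scale, so compare directions.
cos = theta @ theta_star / (np.linalg.norm(theta) * np.linalg.norm(theta_star))
print(f"cosine similarity to theta*: {cos:.3f}")
```

The point of the sketch is that preference labels alone, with no explicit reward signal, suffice to pin down the reward direction; the paper's contribution is choosing *which* trajectory pairs to query so that far fewer such labels are needed.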