BriefGPT.xyz
Aug, 2023
Iterative Reward Shaping using Human Feedback for Correcting Reward Misspecification
Jasmina Gajcin, James McCarthy, Rahul Nair, Radu Marinescu, Elizabeth Daly...
TL;DR
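Proposes ITERS, an iterative reward shaping method that uses human feedback. Users provide trajectory-level feedback during training, and that feedback is combined with user explanations to improve the reward function, successfully correcting misspecified reward functions.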
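The loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple lookup-table penalty in place of whatever shaping model ITERS actually learns, and every name in it (`shape_reward`, `iterate_shaping`, `rollout`, `get_feedback`, and the toy "speed"/"cruise" actions) is hypothetical.

```python
from typing import Callable, Hashable, Iterable, List, Set, Tuple

Step = Tuple[Hashable, Hashable]            # (state, action)
RewardFn = Callable[[Hashable, Hashable], float]


def shape_reward(env_reward: RewardFn, flagged: Set[Step], penalty: float) -> RewardFn:
    """Return a reward function that subtracts `penalty` on flagged (state, action) pairs."""
    def shaped(state: Hashable, action: Hashable) -> float:
        bonus = -penalty if (state, action) in flagged else 0.0
        return env_reward(state, action) + bonus
    return shaped


def iterate_shaping(
    env_reward: RewardFn,
    rollout: Callable[[RewardFn], List[Step]],
    get_feedback: Callable[[List[Step]], Iterable[Step]],
    n_iters: int = 3,
    penalty: float = 1.0,
) -> Tuple[RewardFn, Set[Step]]:
    """Alternate between acting, collecting feedback, and re-shaping the reward."""
    flagged: Set[Step] = set()
    reward_fn = env_reward
    for _ in range(n_iters):
        trajectory = rollout(reward_fn)            # agent acts under the current reward
        flagged |= set(get_feedback(trajectory))   # user marks undesired steps
        reward_fn = shape_reward(env_reward, flagged, penalty)
    return reward_fn, flagged


# Toy setup (all assumptions): the environment reward wrongly favors "speed".
def env_reward(state, action):
    return 1.0 if action == "speed" else 0.5


def rollout(reward_fn):
    # Stand-in for RL training: pick the greedy action in each of three states.
    actions = ["speed", "cruise"]
    return [(s, max(actions, key=lambda a: reward_fn(s, a))) for s in range(3)]


def get_feedback(trajectory):
    # Stand-in for the human: flag every step where the agent speeds.
    return [step for step in trajectory if step[1] == "speed"]


shaped, flagged = iterate_shaping(env_reward, rollout, get_feedback, n_iters=2)
print(rollout(shaped))
```

In this toy setup the first iteration flags the speeding steps, so the shaped reward ranks "cruise" above "speed" in the flagged states from the second iteration on.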
Abstract
A well-defined reward function is crucial for successful training of a reinforcement learning (RL) agent. However, defining a suitable reward fu...