BriefGPT.xyz
Dec, 2024
Adaptive Querying for Reward Learning from Human Feedback
Yashwanthi Anand, Sandhya Saisubramanian
TL;DR
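This work addresses a limitation of existing methods for learning from human feedback, which rely on a single query format, and proposes an adaptive feedback selection framework that can draw on multiple modes of user interaction. By optimizing both the states that are queried and the feedback format, the method improves learning of a penalty function for unsafe behavior and demonstrates its sample efficiency in simulated evaluations.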
Abstract
Learning from human feedback is a popular approach to train robots to adapt to user preferences and improve safety. Existing approaches typically consider a single querying (interaction) format when seeking
→