Learning from human feedback is a popular approach to training robots to adapt to user preferences and to improve safety. Existing approaches typically consider a single querying (interaction) format when seeking human feedback and do not leverage multiple modes of user interaction with a robot. We examine how to learn a penalty function associated with unsafe behaviors, such as side effects, from multiple forms of human feedback by optimizing both the query state and the feedback format. Our framework for adaptive feedback selection queries for feedback in critical states, in the most informative format, while accounting for the cost and the probability of receiving feedback in a given format. We employ an iterative, two-phase approach that first selects critical states for querying and then uses information gain to select a feedback format for querying across the sampled critical states. Our evaluation in simulation demonstrates the sample efficiency of our approach.
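The format-selection phase described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a discrete set of penalty hypotheses, a known observation model per feedback format, and a scoring rule of response probability times expected information gain minus cost. The scoring rule, the function names, and the example formats and numbers are all hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def expected_info_gain(prior, likelihoods):
    """Expected entropy reduction over penalty hypotheses.

    likelihoods[h][o] is P(feedback o | hypothesis h) for one format.
    """
    gain = entropy(prior)
    for o in range(len(likelihoods[0])):
        p_o = sum(prior[h] * likelihoods[h][o] for h in range(len(prior)))
        if p_o == 0:
            continue
        posterior = [prior[h] * likelihoods[h][o] / p_o
                     for h in range(len(prior))]
        gain -= p_o * entropy(posterior)
    return gain

def select_format(prior, formats):
    """Pick the format with the highest assumed score.

    formats maps a name to (likelihoods, response_prob, cost).
    Score (an assumption): response_prob * info_gain - cost.
    """
    def score(name):
        lik, p_resp, cost = formats[name]
        return p_resp * expected_info_gain(prior, lik) - cost
    return max(formats, key=score)

# Hypothetical example: two penalty hypotheses, three feedback formats.
prior = [0.5, 0.5]
formats = {
    "approval":   ([[1.0, 0.0], [0.0, 1.0]], 0.90, 0.10),
    "correction": ([[1.0, 0.0], [0.0, 1.0]], 0.50, 0.30),
    "ranking":    ([[0.8, 0.2], [0.2, 0.8]], 0.95, 0.05),
}
best = select_format(prior, formats)
```

In this toy setting, a format that is equally informative but costlier or less likely to be answered scores lower, which is the trade-off the abstract describes. Critical-state selection (the first phase) is left out of the sketch.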

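This work addresses the limitation that existing methods for learning from human feedback rely on a single query format, and proposes an adaptive feedback selection framework that can leverage multiple modes of user interaction. By optimizing the query state and the feedback format, the method improves learning of a penalty function for unsafe behaviors, and its evaluation in simulation demonstrates its sample efficiency.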

Adaptive Querying for Reward Learning from Human Feedback