Environment Design for Inverse Reinforcement Learning

October 2022

Environment Design for Inverse Reinforcement Learning

Thomas Kleine Buening, Christos Dimitrakakis

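TL;DR: Adaptively designing the environments in which the expert demonstrates improves learning efficiency and robustness, addressing the challenges of learning a reward function from expert demonstrations and under changing environment dynamics.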

Abstract

The task of learning a reward function from expert demonstrations suffers from high sample complexity as well as inherent limitations to what can be learned from demonstrations in a given environment. As the samples used for →