BriefGPT.xyz
May, 2018
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning
Kelvin Xu, Ellis Ratner, Anca Dragan, Sergey Levine, Chelsea Finn
TL;DR
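This paper learns a prior for inferring reward functions from demonstrations of other tasks, which improves the ability to infer expressive reward functions from a limited number of demonstrations, and shows that the approach can effectively recover rewards for new tasks from images.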
Abstract
A significant challenge for the practical application of reinforcement learning in the real world is the need to specify an oracle reward function that correctly defines a task. Inverse