BriefGPT.xyz
Aug, 2021
Imitation Learning by Reinforcement Learning
Kamil Ciosek
TL;DR
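For deterministic experts, this paper reduces imitation learning to a reinforcement learning problem with a fixed reward. It shows that the expert's reward can be recovered, and it relates the total variation distance between the imitator and the expert to adversarial imitation learning. Experiments on continuous control tasks confirm that the reduction is effective.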
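To make the idea of a fixed reward concrete, here is a minimal sketch. It assumes the reward is an indicator: 1 for state-action pairs that appear in the expert demonstrations and 0 otherwise. This summary does not specify that form, so treat it as an illustrative assumption. The names `make_indicator_reward` and `demos` are hypothetical, and states and actions are assumed discrete and hashable.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

StateAction = Tuple[Hashable, Hashable]


def make_indicator_reward(
    expert_pairs: Iterable[StateAction],
) -> Callable[[Hashable, Hashable], float]:
    """Build a fixed reward from demonstrations (assumed indicator form)."""
    # Store the demonstrated pairs once; the reward never changes afterwards.
    expert_set: Set[StateAction] = set(expert_pairs)

    def reward(state: Hashable, action: Hashable) -> float:
        # 1.0 when the imitator reproduces a demonstrated pair, 0.0 otherwise.
        return 1.0 if (state, action) in expert_set else 0.0

    return reward


# Hypothetical toy demonstrations from a deterministic expert.
demos = [(0, "right"), (1, "right"), (2, "up")]
reward = make_indicator_reward(demos)
```

Any standard reinforcement learning algorithm could then be trained against `reward` in place of the environment's own reward. Exact matching only makes sense for discrete spaces; this summary does not say how continuous control tasks are handled.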
Abstract
Imitation learning algorithms learn a policy from demonstrations of expert behavior. Somewhat counterintuitively, we show that, for deterministic experts,