May, 2018
Imitating Latent Policies from Observation
Ashley D. Edwards, Himanshu Sahni, Yannick Schroecker, Charles L. Isbell
TL;DR
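This paper proposes a new imitation learning method that infers latent policies directly from state observations. It introduces a method for characterizing the causal effects of latent actions on observations while simultaneously predicting their likelihood, and uses this to determine the mapping between latent and real actions. The method is evaluated in classic control environments and a platform game, where it outperforms standard approaches.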
Abstract
We describe a novel approach to imitation learning that infers latent policies directly from state observations. We introduce a method that characterizes the …
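The two ideas in the summary, inferring which latent action explains an observed state change and then mapping latent actions to real ones, can be pictured with a toy sketch. This is an illustration only and not the authors' implementation: the function names, the nearest-prediction rule, the majority-vote remapping, and all numbers are assumptions made for the example. The summary itself does not specify models, losses, or the remapping procedure.

```python
import numpy as np


def nearest_latent_action(predicted_next, observed_next):
    """Return the index of the latent action whose predicted next state
    is closest (Euclidean distance) to the observed next state.

    predicted_next: array of shape (num_latent, state_dim), one prediction
                    per hypothetical latent action.
    observed_next:  array of shape (state_dim,).
    """
    errors = np.linalg.norm(predicted_next - observed_next, axis=1)
    return int(np.argmin(errors))


def remap_latent_to_real(latent_ids, real_actions, num_latent):
    """Map each latent action to the real action most often paired with it
    (a simple majority vote, assumed here for illustration)."""
    mapping = {}
    for z in range(num_latent):
        paired = [a for zi, a in zip(latent_ids, real_actions) if zi == z]
        if paired:
            values, counts = np.unique(paired, return_counts=True)
            mapping[z] = int(values[np.argmax(counts)])
    return mapping


# Hypothetical predictions for two latent actions in a 2-D state space.
predicted_next = np.array([[0.9, 0.0],   # latent action 0
                           [1.1, 0.0]])  # latent action 1
observed_next = np.array([1.08, 0.0])
z = nearest_latent_action(predicted_next, observed_next)

# Hypothetical (latent action, real action) pairs gathered from a few
# interactions with the environment.
mapping = remap_latent_to_real([0, 1, 1, 0, 1], [2, 5, 5, 2, 2], num_latent=2)
```

In this toy setup, only the remapping step needs real action labels. The latent action itself is chosen from state observations alone, which is the property the summary emphasizes.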