The goal of imitation learning is to match example expert behavior, without access to a reinforcement signal. Expert demonstrations provided by humans, however, often show significant variability due to latent factors that are not explicitly modeled. We introduce an extension to the Generative Adversarial Imitation Learning method that can infer the latent structure of human decision-making in an unsupervised way. Our method can not only imitate complex behaviors, but also learn interpretable and meaningful representations. We demonstrate that the approach is applicable to high-dimensional environments including raw visual inputs. In the highway driving domain, we show that a model learned from demonstrations is able to both produce different styles of human-like driving behaviors and accurately anticipate human actions. Our method surpasses various baselines in terms of performance and functionality.

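This paper proposes an imitation learning algorithm based on generative adversarial models that can infer, in an unsupervised way, the latent structure hidden in expert demonstrations. It can also learn interpretable and meaningful representations of complex behavioral data, including image demonstrations. In the driving domain, we show that a model learned from human demonstrations can accurately reproduce a variety of behaviors and can accurately predict human actions from raw visual inputs. Compared with other baseline algorithms, our method better captures the latent structure hidden in expert demonstrations, often recovering semantically meaningful factors of variation in the data.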

InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations