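TL;DR: This paper introduces Model-based Adversarial Imitation Learning (MAIL), a model-based approach to the adversarial imitation learning problem. It uses a forward model to make the system fully differentiable, which allows a strong policy to be trained. Tested on the MuJoCo physics simulator, the method's initial results surpass the current state of the art.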
Abstract
Generative adversarial learning is a popular new approach to training generative models that has also proven successful for other related problems. The general idea is to maintain an oracle $D$ that discriminates between the expert's data distribution and that of the generativ