Causal Confusion in Imitation Learning

May, 2019

Causal Confusion in Imitation Learning

Pim de Haan, Dinesh Jayaraman, Sergey Levine

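TL;DR: Behavioral cloning reduces policy learning to supervised learning, but ignoring causality can lead to causal misidentification. The problem can be resolved by using targeted interventions (environment interaction or expert queries) to determine the correct causal model. The study shows that the problem arises in several domains, including control and driving tasks, and the proposed solution is validated against baselines such as DAgger and against ablations.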

Abstract

Behavioral cloning reduces policy learning to supervised learning by training a discriminative model to predict expert actions given observations. Such discriminative models are non-causal: the training procedure