BriefGPT.xyz
May, 2019
Causal Confusion in Imitation Learning
Pim de Haan, Dinesh Jayaraman, Sergey Levine
TL;DR
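Behavioral cloning reduces policy learning to supervised learning, but ignoring causality can lead to causal misidentification. The problem can be resolved with targeted interventions (environment interaction or expert queries) that determine the correct causal model. The study shows that the problem arises in several domains, such as control and driving tasks, and validates the proposed solution against DAgger and other baselines and ablations.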
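As a minimal sketch of the reduction described above, behavioral cloning can be written as an ordinary regression from observations to expert actions. The linear policy, the synthetic expert data, and the names `behavioral_cloning`, `true_W`, `obs`, and `actions` are illustrative assumptions, not the paper's setup, and the sketch does not reproduce the causal misidentification itself.

```python
import numpy as np


def behavioral_cloning(observations, expert_actions):
    """Fit a linear policy a = obs @ W by least squares on expert data."""
    W, *_ = np.linalg.lstsq(observations, expert_actions, rcond=None)
    return W


rng = np.random.default_rng(0)

# Hypothetical expert: the action depends only on the first observation feature.
true_W = np.array([[2.0], [0.0]])

# Synthetic demonstrations: 200 observations with 2 features each.
obs = rng.normal(size=(200, 2))
actions = obs @ true_W

# Supervised learning step: regress expert actions on observations.
W = behavioral_cloning(obs, actions)
```

The fit here uses only observation-action pairs and never interacts with the environment, which is the sense in which the learning problem becomes purely supervised.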
Abstract
Behavioral cloning reduces policy learning to supervised learning by training a discriminative model to predict expert actions given observations. Such discriminative models are non-causal: the training procedure …