A Study on Fighting Copycat Agents in Behavioral Cloning

Oct 2020

Fighting Copycat Agents in Behavioral Cloning from Observation Histories

Chuan Wen, Jierui Lin, Trevor Darrell, Dinesh Jayaraman, Yang Gao

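TL;DR: This paper proposes an adversarial solution to the problem of policies repeatedly exploiting expert action sequences under partial observability, improving performance on multiple partially observed imitation learning tasks.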

Abstract

Imitation learning trains policies to map from input observations to the actions that an expert would choose. In this setting, distribution shift frequently exacerbates the effect of misattributing expert actions