BriefGPT.xyz
Oct, 2020
Fighting Copycat Agents in Behavioral Cloning from Observation Histories
Chuan Wen, Jierui Lin, Trevor Darrell, Dinesh Jayaraman, Yang Gao
TL;DR
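This paper proposes an adversarial solution to the problem of a policy simply reusing the expert's past action sequence under partial observation, improving performance on several partially observed imitation learning tasks.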
Abstract
Imitation learning trains policies to map from input observations to the actions that an expert would choose. In this setting, distribution shift frequently exacerbates the effect of misattributing expert actions
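To make the setting in the abstract concrete, here is a minimal sketch of behavioral cloning in its simplest form: fitting a policy to (observation, expert action) pairs by supervised regression. It is not the method of the paper. The one-dimensional task, the linear expert `expert_action`, and the helper names `collect_demonstrations`, `fit_linear_policy`, and `policy` are all hypothetical and chosen only for illustration.

```python
import random


def expert_action(obs):
    """Hypothetical expert: act proportionally against the observed offset."""
    return -0.5 * obs


def collect_demonstrations(n, seed=0):
    """Sample n observations and record the expert's action for each one."""
    rng = random.Random(seed)
    observations = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    actions = [expert_action(o) for o in observations]
    return observations, actions


def fit_linear_policy(observations, actions):
    """Closed-form least-squares fit of action = w * obs + b."""
    n = len(observations)
    mean_o = sum(observations) / n
    mean_a = sum(actions) / n
    cov = sum((o - mean_o) * (a - mean_a) for o, a in zip(observations, actions))
    var = sum((o - mean_o) ** 2 for o in observations)
    w = cov / var
    b = mean_a - w * mean_o
    return w, b


def policy(obs, w, b):
    """Cloned policy: predict the action the expert would take."""
    return w * obs + b


# Fit the policy on demonstrations drawn from the training distribution.
observations, actions = collect_demonstrations(200)
w, b = fit_linear_policy(observations, actions)
```

Because the hypothetical expert here is exactly linear and noise-free, the fitted policy matches it on any observation. The difficulty the abstract points to arises in less benign cases, where the policy is deployed on observations that differ from the demonstrations and where the inputs contain variables that merely correlate with the expert's actions.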