BriefGPT.xyz
May, 2019
Imitation Learning as $f$-Divergence Minimization
Liyiming Ke, Matt Barnes, Wen Sun, Gilwoo Lee, Sanjiban Choudhury...
TL;DR
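This paper addresses imitation learning from multi-modal demonstrations. To avoid the incorrect interpolation between modes seen in existing methods, it minimizes the reverse KL divergence (the I-projection) between the learner and expert state-action distributions instead of the forward KL divergence, within a general framework for estimating and minimizing different f-divergences. The resulting approximate I-projection method imitates multi-modal behavior more reliably than GAIL and behavior cloning.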
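The difference between the two KL directions can be seen in a toy one-dimensional example. The sketch below is not the paper's algorithm: it assumes a hypothetical bimodal "expert" distribution with modes at -2 and 2, a single-Gaussian "learner" of fixed width, and a plain grid search over the learner's mean. All names and constants are illustrative.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def normalize(weights):
    """Turn non-negative weights on the grid into a probability vector."""
    total = sum(weights)
    return [w / total for w in weights]

def kl(p, q):
    """Discrete KL(p || q); terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

STD = 0.5                                    # assumed width, shared by all Gaussians
xs = [-8.0 + 0.02 * i for i in range(801)]   # evaluation grid on [-8, 8]

# Hypothetical expert: equal mixture of two modes at -2 and +2.
expert = normalize(
    [0.5 * normal_pdf(x, -2.0, STD) + 0.5 * normal_pdf(x, 2.0, STD) for x in xs]
)

def learner(mean):
    """Unimodal learner: one Gaussian with the given mean."""
    return normalize([normal_pdf(x, mean, STD) for x in xs])

candidate_means = [-4.0 + 0.1 * i for i in range(81)]

# Forward KL, KL(expert || learner): the learner must cover both modes.
mu_forward = min(candidate_means, key=lambda m: kl(expert, learner(m)))

# Reverse KL (I-projection), KL(learner || expert): the learner may pick one mode.
mu_reverse = min(candidate_means, key=lambda m: kl(learner(m), expert))

print("forward-KL mean:", mu_forward)
print("reverse-KL mean:", mu_reverse)
```

In this setting the forward-KL fit places the learner between the two modes, where the expert has almost no mass, while the reverse-KL fit lands on one of the modes. Which of the two modes is selected is arbitrary here, since the toy expert is symmetric.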
Abstract
We address the problem of imitation learning with multi-modal demonstrations. Instead of attempting to learn all modes, we argue that in many tasks it is sufficient to imitate any one of them. We show that the st