BriefGPT.xyz
May, 2021
Cross-domain Imitation from Observations
Dripta S. Raychaudhuri, Sujoy Paul, Jeroen van Baar, Amit K. Roy-Chowdhury
TL;DR
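This work addresses the discrepancy between the expert and the agent being trained. It proposes a framework that learns correspondences from unpaired, unaligned trajectories under cycle-consistency constraints to bridge the domain gap, and demonstrates the method's effectiveness through experiments.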
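To make the cycle-consistency idea concrete, here is a minimal sketch, not the authors' implementation. It assumes two hypothetical state-mapping functions, one from the agent domain to the expert domain and one back, and measures how far a round trip moves each state. The function name, the linear maps, and the sample states are all illustrative assumptions.

```python
import numpy as np


def cycle_consistency_loss(states_a, map_a_to_b, map_b_to_a):
    """Mean squared round-trip error: states -> other domain -> back."""
    reconstructed = map_b_to_a(map_a_to_b(states_a))
    return float(np.mean(np.sum((reconstructed - states_a) ** 2, axis=1)))


# Hypothetical invertible linear map between two 3-D state spaces.
W = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
W_inv = np.linalg.inv(W)

# Two illustrative agent-domain states (one per row).
states = np.array([[1.0, 2.0, 3.0],
                   [0.5, -1.0, 2.0]])

# A backward map that truly inverts the forward map gives near-zero loss.
perfect = cycle_consistency_loss(states, lambda s: s @ W, lambda s: s @ W_inv)

# A backward map that ignores the forward map (identity) is penalized.
mismatched = cycle_consistency_loss(states, lambda s: s @ W, lambda s: s)
```

In a learned setting the two maps would be trainable networks and this loss would be one term in the objective; the sketch only shows that the loss vanishes when the maps are mutual inverses and grows when they are not.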
Abstract
Imitation learning seeks to circumvent the difficulty in designing proper reward functions for training agents by utilizing expert behavior. With environments modeled as Markov decision processes (MDPs), most of t