June 2019
Goal-conditioned Imitation Learning
Yiming Ding, Carlos Florensa, Mariano Phielipp, Pieter Abbeel
TL;DR
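This work investigates how incorporating demonstrations can speed up the convergence of reinforcement learning toward a policy that can reach any goal, and finds that the resulting agents outperform agents trained with other imitation learning algorithms.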
Abstract
Designing rewards for reinforcement learning (RL) is challenging because it needs to convey the desired task, be efficient to optimize, and be easy to compute. The latter is particularly problematic when applying RL to