BriefGPT.xyz
Dec, 2019
Learning to Reach Goals via Iterated Supervised Learning
Learning To Reach Goals Without Reinforcement Learning
Dibya Ghosh, Abhishek Gupta, Justin Fu, Ashwin Reddy, Coline Devine...
TL;DR
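This paper introduces a reinforcement learning algorithm that uses imitation learning to acquire goal-reaching policies from scratch, without expert demonstrations or a value function. On several benchmark tasks, the algorithm achieves better goal-reaching performance and robustness than existing reinforcement learning algorithms.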
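One way to read "imitation learning without expert demonstrations" is that the agent imitates its own past behavior, treating each state it actually reached as the goal it was aiming for. The summary does not spell out this mechanism, so the sketch below is an assumption rather than the authors' implementation; `relabel`, the grid-style states, and the string actions are hypothetical names chosen for illustration.

```python
from typing import List, Tuple

State = Tuple[int, int]   # hypothetical 2-D grid position
Action = str              # hypothetical discrete action label
Example = Tuple[State, State, int, Action]


def relabel(states: List[State], actions: List[Action]) -> List[Example]:
    """Turn one trajectory into supervised (state, goal, horizon, action) examples.

    `states` holds len(actions) + 1 entries: the state before each action
    plus the final state. Every state reached after step t is treated as a
    goal that the action taken at step t helped to reach.
    """
    examples: List[Example] = []
    for t in range(len(actions)):
        for k in range(t + 1, len(states)):
            examples.append((states[t], states[k], k - t, actions[t]))
    return examples


# A two-step trajectory collected by an arbitrary (possibly random) policy.
states = [(0, 0), (1, 0), (1, 1)]
actions = ["right", "up"]
examples = relabel(states, actions)

for example in examples:
    print(example)
```

Under this assumption, the resulting examples could be fed to any supervised learner that predicts the action from the state, goal, and horizon; no reward signal or value function appears anywhere in the data.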
Abstract
Imitation learning algorithms provide a simple and straightforward approach for training control policies via supervised learning. By maximizing the likelihood of good actions provided by an expert demonstrator, supervised