BriefGPT.xyz
Sep, 2017
Overcoming Exploration in Reinforcement Learning with Demonstrations
Ashvin Nair, Bob McGrew, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel
TL;DR
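This work uses demonstrations to overcome the exploration problem in reinforcement learning with sparse rewards, and successfully learns long-horizon, multi-step robotics tasks. The method builds on DDPG and HER and gives an order-of-magnitude speedup over previous RL algorithms on simulated robotics tasks. It is simple to implement, solves tasks that neither behavior cloning nor RL alone can solve, and often outperforms the demonstrator policy.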
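To make the sparse-reward setting and the HER component concrete, here is a minimal sketch of hindsight goal relabeling on a toy 1-D task. Everything in it is an assumption made for illustration and is not taken from the paper's code: the function names `sparse_reward` and `her_relabel`, the 0 / -1 reward convention, the tolerance, and the "future" goal-sampling strategy.

```python
import random

def sparse_reward(achieved, goal, tol=0.05):
    """Hypothetical sparse reward: 0.0 if the goal is reached within tol, else -1.0."""
    return 0.0 if abs(achieved - goal) <= tol else -1.0

def her_relabel(episode, rng):
    """Relabel each transition with a goal that was actually achieved at the
    same or a later step of the same episode (assumed "future" strategy).

    Each input transition is (state, action, achieved, goal); each output
    transition is (state, action, achieved, new_goal, reward).
    """
    relabeled = []
    for t, (state, action, achieved, _goal) in enumerate(episode):
        # Pick the achieved position of a step in [t, end of episode] as the new goal.
        new_goal = episode[rng.randrange(t, len(episode))][2]
        reward = sparse_reward(achieved, new_goal)
        relabeled.append((state, action, achieved, new_goal, reward))
    return relabeled

# Toy episode: the agent moves right in steps of 0.1 but never reaches goal 1.0,
# so every original transition carries the uninformative reward -1.0.
episode = [
    (0.0, 0.1, 0.1, 1.0),
    (0.1, 0.1, 0.2, 1.0),
    (0.2, 0.1, 0.3, 1.0),
]
relabeled = her_relabel(episode, random.Random(0))
```

After relabeling, at least the final transition is rewarded, because its only candidate goal is the position it actually reached. That is the sense in which hindsight relabeling turns failed episodes into a learning signal; how demonstrations are combined with it is not shown in this sketch.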
Abstract
Exploration in environments with sparse rewards has been a persistent problem in reinforcement learning (RL). Many tasks are natural to specify with a sparse reward, and manually shaping a reward function can res