BriefGPT.xyz
Jan, 2023
Sample Efficient Deep Reinforcement Learning via Local Planning
Dong Yin, Sridhar Thiagarajan, Nevena Lazic, Nived Rajaraman, Botao Hao...
TL;DR
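This paper proposes an algorithmic framework called "uncertainty-first local planning". Exploiting a property of simulators, in each data-collection iteration it resets the environment, with some probability, to a previously observed state with high uncertainty. This substantially improves the sample cost of several baseline reinforcement learning algorithms on hard exploration tasks and achieves superhuman performance on the Atari game Montezuma's Revenge.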
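The reset rule summarized in the TL;DR can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the 1-D chain simulator, the random-walk behavior policy, the count-based uncertainty proxy, and the names `uncertainty`, `pick_start_state`, `collect`, and `p_reset` are all hypothetical. The only idea taken from the summary is that, at the start of a data-collection iteration, the environment is reset with some probability to a previously observed state with high uncertainty.

```python
import math
import random
from collections import Counter


def uncertainty(counts, state):
    """Count-based proxy (an assumption): rarely visited states score higher."""
    return 1.0 / math.sqrt(1 + counts[state])


def pick_start_state(counts, initial_state, p_reset, rng):
    """With probability p_reset, return the most uncertain observed state;
    otherwise (or if nothing has been observed yet) return the initial state."""
    if counts and rng.random() < p_reset:
        return max(counts, key=lambda s: uncertainty(counts, s))
    return initial_state


def collect(num_episodes, horizon, p_reset, seed=0, chain_length=50):
    """Random-walk data collection on a hypothetical 1-D chain simulator."""
    rng = random.Random(seed)
    counts = Counter()  # visit counts over observed states
    for _ in range(num_episodes):
        # The simulator is assumed to support resetting to any observed state.
        state = pick_start_state(counts, 0, p_reset, rng)
        counts[state] += 1
        for _ in range(horizon):
            step = rng.choice((-1, 1))
            state = min(max(state + step, 0), chain_length - 1)
            counts[state] += 1
    return counts
```

Calling `collect(num_episodes=200, horizon=10, p_reset=0.0)` always restarts from state 0, while a positive `p_reset` sometimes restarts from the least-visited observed state. Comparing `len(counts)` between the two settings gives a rough sense of how such resets might change state coverage in this toy setting.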
Abstract
The focus of this work is sample-efficient deep reinforcement learning (RL) with a simulator. One useful property of simulators is that it is typically easy to reset the environment to a previously observed state.