BriefGPT.xyz
Jun, 2017
Count-Based Exploration in Feature Space for Reinforcement Learning
Jarryd Martin, Suraj Narayanan Sasikumar, Tom Everitt, Marcus Hutter
TL;DR
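This paper introduces a new count-based optimistic exploration algorithm that can be used in high-dimensional state-action spaces. It also proposes a new method for computing generalized state visit counts, which addresses the problem of generalizing state estimates from limited training experience. Experiments show that the algorithm achieves near state-of-the-art results on high-dimensional RL benchmarks at low computational cost.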
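The idea of counting visits in feature space rather than in the raw state space can be illustrated with a small sketch. This is not the paper's algorithm, only a toy version under stated assumptions: the class name `FeatureCountBonus`, the factorized per-feature counting, the `beta` coefficient, the smoothing constant, and the `beta / sqrt(count)` bonus form are all hypothetical choices made for illustration.

```python
import math


class FeatureCountBonus:
    """Toy generalized visit count over discrete feature vectors (illustrative only)."""

    def __init__(self, beta=0.05, smoothing=0.01):
        self.beta = beta            # assumed bonus scale
        self.smoothing = smoothing  # assumed constant that avoids division by zero
        self.t = 0                  # number of observations so far
        self.counts = {}            # (feature index, feature value) -> count

    def generalized_count(self, features):
        """Estimate a visit count as t times a factorized density over features."""
        if self.t == 0:
            return 0.0
        density = 1.0
        for i, value in enumerate(features):
            density *= self.counts.get((i, value), 0) / self.t
        return self.t * density

    def update(self, features):
        """Record one observed feature vector."""
        self.t += 1
        for i, value in enumerate(features):
            key = (i, value)
            self.counts[key] = self.counts.get(key, 0) + 1

    def bonus(self, features):
        """Optimistic reward bonus that shrinks as the generalized count grows."""
        count = self.generalized_count(features)
        return self.beta / math.sqrt(count + self.smoothing)
```

In this sketch, a state whose exact feature vector was never observed can still receive a non-zero count when each of its feature values has been seen in other states, so the bonus generalizes across states that share features.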
Abstract
We introduce a new count-based optimistic exploration algorithm for reinforcement learning (RL) that is feasible in environments with high-dimens…