BriefGPT.xyz
Jul, 2019
On Hard Exploration for Reinforcement Learning: a Case Study in Pommerman
Chao Gao, Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor
TL;DR
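This study examines how best to explore in domains with sparse, delayed, and deceptive rewards. By analyzing what makes Pommerman difficult, it proposes a model-based automatic reasoning module that can be used for safer exploration, and experiments show that the module significantly improves learning.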
Abstract
How to best explore in domains with sparse, delayed, and deceptive rewards is an important open problem for reinforcement learning (RL). This paper considers one such domain, the recently-proposed multi-agent benchmark of