BriefGPT.xyz
Oct, 2020
Prioritized Level Replay
Minqi Jiang, Ed Grefenstette, Tim Rocktäschel
TL;DR
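This work proposes Prioritized Level Replay (PLR), a method that uses a prioritized replay mechanism in deep reinforcement learning to select the next training level. By sampling training levels appropriately, PLR significantly improves sample efficiency and generalization on the Procgen Benchmark and surpasses the previous best results.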
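The idea of choosing the next training level by priority can be sketched as follows. This is a minimal illustration, not the authors' implementation: the summary does not say how levels are scored or how scores become probabilities, so the rank-based weighting, the `temperature` parameter, the function names, and the example scores are all assumptions made here.

```python
import random


def level_sampling_probs(scores, temperature=1.0):
    """Map per-level scores to sampling probabilities.

    Assumption: rank-based prioritization, where the highest-scoring
    level gets rank 1 and weight (1 / rank) ** (1 / temperature).
    """
    # Sort level ids by descending score so the best level has rank 1.
    ordered = sorted(scores, key=scores.get, reverse=True)
    weights = {
        level: (1.0 / rank) ** (1.0 / temperature)
        for rank, level in enumerate(ordered, start=1)
    }
    total = sum(weights.values())
    return {level: w / total for level, w in weights.items()}


def sample_next_level(scores, rng, temperature=1.0):
    """Draw the next training level according to its priority."""
    probs = level_sampling_probs(scores, temperature)
    levels = list(probs)
    return rng.choices(levels, weights=[probs[l] for l in levels], k=1)[0]


# Hypothetical scores: a larger value means the level is assumed to be
# more useful to revisit (for example, a larger recent learning signal).
scores = {"level_0": 0.05, "level_1": 0.80, "level_2": 0.30}
probs = level_sampling_probs(scores)
next_level = sample_next_level(scores, random.Random(0))
```

A lower `temperature` concentrates sampling on the top-ranked levels, while a higher one moves it toward uniform sampling over all levels.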
Abstract
Simulated environments with procedurally generated content have become popular benchmarks for testing systematic generalization of reinforcement learning agents. Every level in such an environment is algorithmica…