BriefGPT.xyz
Oct, 2021
Mastering Atari Games with Limited Data
Weirui Ye, Shaohuai Liu, Thanard Kurutach, Pieter Abbeel, Yang Gao
TL;DR
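EfficientZero is a sample-efficient, model-based visual reinforcement learning algorithm built on MuZero. With only two hours of real-time game experience, it achieves 194.3% mean human performance and 109.0% median human performance on the Atari 100k benchmark, and it outperforms the state SAC algorithm on some tasks in the DMControl 100k benchmark. It is the first algorithm to reach super-human performance on Atari games with so little data.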
Abstract
Reinforcement learning has achieved great success in many applications. However, sample efficiency remains a key challenge, with prominent methods requiring millions (or even billions) of environment steps to tra…