Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation

Feb, 2018

Dane Corneil, Wulfram Gerstner, Johanni Brea

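TL;DR: Planning by prioritized sweeping with VaST improves the sample efficiency of reinforcement learning agents. In tasks such as 3D navigation, VaST learns quickly and adapts effectively to sudden changes in rewards or transition probabilities.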

Abstract

Modern reinforcement learning algorithms reach super-human performance in many board and video games, but they are sample inefficient, i.e., they typically require significantly more playing experience than humans to reach the same level of performance. To improve ...