BriefGPT.xyz
Jun, 2019
Learning Causal State Representations of Partially Observable Environments
Amy Zhang, Zachary C. Lipton, Luis Pineda, Kamyar Azizzadenesheli, Anima Anandkumar...
TL;DR
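This paper proposes an algorithm, based on recurrent neural networks (RNNs), for approximating causal states: it learns a causal state representation that predicts future observations from the history of actions and observations in a POMDP. Experiments show that the learned state representations can be used to efficiently learn policies for reinforcement learning problems with rich observation spaces, and the approach is compared against prior methods.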
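To make the notion of a causal state more concrete, here is a minimal tabular sketch, not the RNN-based method the paper proposes. It groups action-observation histories whose empirical next-observation distributions coincide, so each group plays the role of one causal state. The function name `causal_state_partition`, the `ndigits` rounding parameter, and the toy samples are all assumptions made for illustration, and the sketch simplifies by looking only one step ahead and ignoring the next action.

```python
from collections import Counter, defaultdict


def causal_state_partition(samples, ndigits=2):
    """Group histories by their empirical next-observation distribution.

    samples: iterable of (history, next_obs) pairs, where each history is a
    hashable tuple of (action, observation) steps. (Hypothetical data layout.)
    Returns a list of groups; each group is a sorted list of histories.
    """
    # Count how often each next observation follows each history.
    counts = defaultdict(Counter)
    for history, next_obs in samples:
        counts[history][next_obs] += 1

    # Histories with the same (rounded) predictive distribution share a group.
    groups = defaultdict(list)
    for history, counter in counts.items():
        total = sum(counter.values())
        signature = tuple(
            sorted((obs, round(n / total, ndigits)) for obs, n in counter.items())
        )
        groups[signature].append(history)
    return [sorted(group) for group in groups.values()]


# Toy data (assumed): the next observation depends only on the last observation,
# so histories ending in "x" are predictively equivalent regardless of action.
toy_samples = [
    ((("a", "x"),), "p"),
    ((("b", "x"),), "p"),
    ((("a", "y"),), "q"),
    ((("a", "y"),), "q"),
]
partition = causal_state_partition(toy_samples)
```

In this toy case the two histories ending in observation `x` fall into one group and the history ending in `y` forms its own group. A learned representation would replace the exact-match grouping with a continuous encoding of the history trained to predict future observations.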
Abstract
Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose mechanisms to approximate causal states, which optimally compress the joint histo