BriefGPT.xyz
June 2021
Learning Markov State Abstractions for Deep Reinforcement Learning
Cameron Allen, Neev Parikh, Omer Gottesman, George Konidaris
TL;DR
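This work proposes a new method for learning Markov state abstractions that combines inverse model estimation with temporal contrastive learning, which can improve sample efficiency in reinforcement learning.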
Abstract
The fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov. However, when MDPs have rich observations, agents typically lear…