BriefGPT.xyz
Nov, 2022
GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond
A Posterior Sampling Framework for Interactive Decision Making
Han Zhong, Wei Xiong, Sirui Zheng, Liwei Wang, Zhaoran Wang...
TL;DR
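We study sample-efficient reinforcement learning under the framework of interactive decision making and propose the generalized eluder coefficient (GEC) as a complexity measure. Using posterior sampling algorithms, we achieve both model-free and model-based learning in fully observable and partially observable environments, with GEC capturing a fundamental tradeoff between exploration and exploitation.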
Abstract
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making, which includes Markov decision process (MDP), partially observable Markov decision process (POMDP)