factored mdp | BriefGPT - AI Paper Digest

Keyword: factored mdp

Search results: 1

Near-Optimal Reinforcement Learning in Factored Markov Decision Processes
By applying the posterior sampling reinforcement learning (PSRL) algorithm and an upper confidence bound algorithm (UCRL-Factored), in what is known to be a facto
PDF | 10 years ago