Feb, 2014
Linear Programming for Large-Scale Markov Decision Problems
Yasin Abbasi-Yadkori, Peter L. Bartlett, Alan Malek
TL;DR
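This paper considers the problem of controlling a Markov decision process with a large state space so as to minimize average cost. It uses linear programming with two methods, one based on stochastic convex optimization and one based on constraint sampling, to find a policy whose performance is close to that of the best policy in a low-dimensional policy class.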
Abstract
We consider the problem of controlling a Markov decision process (MDP) with a large state space, so as to minimize average cost. Since it is intractable to compete with the optimal policy for large scale problems
→