BriefGPT.xyz
May, 2014
Approximate Policy Iteration Schemes: A Comparison
Bruno Scherrer
TL;DR
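This paper considers the infinite-horizon discounted optimal control problem formalized by Markov decision processes. It studies several approximate policy iteration algorithms, analyzes their performance, and shows that the non-stationary policy iteration algorithm can trade off memory against performance.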
Abstract
We consider the infinite-horizon discounted optimal control problem formalized by Markov decision processes. We focus on several approximate variations of the …