Jan, 2013
Policy Iteration for Factored MDPs
Daphne Koller, Ron Parr
TL;DR
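This paper proposes a new value determination method that uses a simple closed-form computation to directly compute a factored approximation of the value function, together with a policy iteration procedure built on that method.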
Abstract
Many large MDPs can be represented compactly using a dynamic Bayesian network. Although the structure of the value function does not retain the structure of the process, recent work has shown that value functions