BriefGPT.xyz
Oct, 2011
First Order Decision Diagrams for Relational MDPs
Chenggang Wang, Saket Joshi, Roni Khardon
TL;DR
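The paper shows that relational MDPs (RMDPs) can be solved using a new compact representation, first order decision diagrams (FODDs). It develops a value iteration algorithm built from operations on FODDs and proves that the algorithm converges fully and yields an optimal policy that is independent of domain size or instantiation.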
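For readers unfamiliar with value iteration, the following is a minimal sketch of the standard tabular version on a flat, fully enumerated MDP. It is not the paper's FODD-based algorithm, which operates on a relational representation without enumerating states. The toy MDP, the `transitions` layout, the discount `gamma`, and every name below are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Assumed layout: transitions[state][action] = [(probability, next_state, reward), ...]
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def q_value(outcomes: List[Tuple[float, str, float]],
            values: Dict[str, float], gamma: float) -> float:
    """Expected discounted return of one action given the current value estimates."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9,
                    tol: float = 1e-8, max_iters: int = 10_000):
    """Repeat Bellman backups until the largest change in any state value is below tol."""
    values = {s: 0.0 for s in transitions}
    for _ in range(max_iters):
        new_values = {
            s: max(q_value(outcomes, values, gamma) for outcomes in actions.values())
            for s, actions in transitions.items()
        }
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the final value estimates.
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in transitions.items()
    }
    return values, policy


# Hypothetical two-state MDP: "move" reaches the absorbing goal with probability 0.8.
toy_mdp: Transitions = {
    "start": {
        "move": [(0.8, "goal", 1.0), (0.2, "start", 0.0)],
        "wait": [(1.0, "start", 0.0)],
    },
    "goal": {
        "stay": [(1.0, "goal", 0.0)],
    },
}

values, policy = value_iteration(toy_mdp)
print(values, policy)
```

The cost of each sweep here grows with the number of enumerated states, which is the dependence on domain size that a relational representation aims to avoid.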
Abstract
Markov decision processes capture sequential decision making under uncertainty, where an agent must choose actions so as to optimize long-term reward. The paper studies efficient reasoning mechanisms for Relation…