BriefGPT.xyz
Jun, 2011
部分可观察的马尔可夫决策过程的不逼近性结果
Nonapproximability Results for Partially Observable Markov Decision Processes
HTML
PDF
J. Goldsmith, C. Lusena, M. Mundhenk
TL;DR
研究表明,对于部分可观察的马尔可夫决策过程的多种变化形式,寻找控制策略的多项式时间算法难以或没有确保在最优解的常数因子或常数项内找到策略。除非某些复杂度类崩溃,否则任何控制策略设计师都必须在性能保证和有效计算之间做出选择。
Abstract
We show that for several variations of
partially observable markov decision processes
, polynomial-time algorithms for finding
control policies
are unlikely to or simply don't have guarantees of finding policies w
→