BriefGPT.xyz
Jan, 2023
Faster Approximate Dynamic Programming by Freezing Slow States
Yijia Wang, Daniel R. Jiang
TL;DR
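The paper proposes an approximate dynamic programming framework for infinite-horizon Markov decision processes (MDPs) with fast-slow structure. The slow states are "frozen" while a set of simpler finite-horizon MDPs is solved, and value iteration is then applied to an auxiliary MDP that transitions on a slower timescale (the upper-level MDP). The approach produces effective policies at lower computational cost and makes it feasible to omit the slow states from the decision model.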
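The freezing idea can be illustrated with a small tabular sketch. This is a simplified illustration under stated assumptions, not the authors' algorithm: the function name `frozen_state_vi`, the array layout, and the modelling choice that the slow state moves exactly once every `T` periods and independently of the action are all hypothetical. Each outer iteration is one value-iteration step on the slow timescale; the inner loop is a `T`-period backward induction over the fast states with the slow state held fixed.

```python
def frozen_state_vi(R, P_fast, P_slow, gamma, T, n_iters=200):
    """Illustrative frozen-slow-state value iteration (hypothetical layout).

    R[s][f][a]          reward for slow state s, fast state f, action a
    P_fast[s][f][a][g]  probability that the fast state moves f -> g
    P_slow[s][s2]       probability that the slow state moves s -> s2,
                        applied once per block of T periods (assumption)
    """
    n_s, n_f, n_a = len(R), len(R[0]), len(R[0][0])
    # Value at the start of each T-period block.
    V = [[0.0] * n_f for _ in range(n_s)]
    for _ in range(n_iters):
        # Block boundary: the slow state transitions once.
        W = [[sum(P_slow[s][s2] * V[s2][f] for s2 in range(n_s))
              for f in range(n_f)]
             for s in range(n_s)]
        # Within the block: T Bellman backups with the slow state frozen at s.
        for _ in range(T):
            W = [[max(R[s][f][a]
                      + gamma * sum(P_fast[s][f][a][g] * W[s][g]
                                    for g in range(n_f))
                      for a in range(n_a))
                  for f in range(n_f)]
                 for s in range(n_s)]
        V = W
    return V
```

With `T = 1` the sketch reduces to ordinary value iteration on a model where both parts of the state move every period. Larger `T` trades modelling accuracy for fewer slow-state updates per simulated period.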
Abstract
We consider infinite horizon Markov decision processes (MDPs) with fast-slow structure, meaning that certain parts of the state space move "fast" (and in a sense, are more influential) while other parts transitio
→