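TL;DR: This paper proposes a new algorithm, Forward-PECVaR, that exactly evaluates stationary policies of CVaR-SSPs with non-uniform costs. It also empirically evaluates the quality of the CVaR Value Iteration algorithm and the effect of the algorithm's parameters on solution quality and scalability.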
Abstract
The stochastic shortest path (SSP) problem models probabilistic sequential-decision problems in which an agent must reach a goal while minimizing a cost function. Because the dynamics are probabilistic, it is desirable for the cost function to account for risk.