Although risk awareness is fundamental to an agent operating online, it has
received less attention in the challenging continuous domain and under partial
observability. This paper presents a novel formulation and solution for
risk-averse, belief-dependent, probabilistically constrained continuous POMDPs. We
tackle a demanding setting of belief-dependent reward and constraint operators.
The probabilistic confidence parameter makes our formulation genuinely
risk-averse and much more flexible than the state-of-the-art chance constraint.
Our rigorous analysis shows that, in the strictest probabilistic confidence case,
our formulation is very close to the chance constraint. However, our probabilistic
formulation allows much faster and more accurate adaptive acceptance or pruning
of actions fulfilling or violating the constraint. In addition, we did not find
any analogs to our approach for an arbitrary confidence parameter. We
present algorithms for the solution of our formulation in continuous domains.
We also extend the chance-constrained approach to continuous environments using
importance sampling. Moreover, all our presented algorithms can be used with
parametric and nonparametric beliefs represented by particles. Finally, we
contribute, rigorously analyze, and simulate an approximation of the
chance-constrained continuous POMDP. The simulations demonstrate that our
algorithms are markedly faster than the baseline while achieving the same
performance in terms of collisions.

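This study addresses continuous POMDP problems under partial observability and
proposes a novel risk-averse, belief-dependent, probabilistically constrained
solution, together with the corresponding algorithms. By handling
belief-dependent reward and constraint operators, the proposed method is more
risk-averse and more flexible than existing techniques while satisfying the same
constraints. Experimental results show that the method has significant
advantages in solving continuous POMDP problems.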