BriefGPT.xyz
June 2023
Probabilistic Constraint for Safety-Critical Reinforcement Learning
Weiqin Chen, Dharmashankar Subramanian, Santiago Paternain
TL;DR
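This paper studies the problem of learning safe policies in probabilistic-constrained reinforcement learning. It proposes two algorithms, Safe Policy Gradient-REINFORCE (SPG-REINFORCE) and SPG-Actor-Critic, together with a Safe Primal-Dual algorithm, to solve it. Experiments validate the effectiveness and advantages of these methods.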
Abstract
In this paper, we consider the problem of learning safe policies for probabilistic-constrained reinforcement learning (RL). Specifically, a safe policy or controller is one that, with high probability, maintains
→