BriefGPT.xyz
Feb, 2024
Achieving $\tilde{O}(1/\epsilon)$ Sample Complexity for Constrained Markov Decision Process
Jiashuo Jiang, Yinyu Ye
TL;DR
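We study the constrained Markov decision process (CMDP) in reinforcement learning and, using an optimization-based algorithm, improve the sample complexity for CMDP problems, achieving improved problem-dependent guarantees.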
Abstract
We consider the reinforcement learning problem for the constrained Markov decision process (CMDP), which plays a central role in satisfying safety or ...