May, 2022
Policy-based Primal-Dual Methods for Convex Constrained Markov Decision Processes
Donghao Ying, Mengzi Guo, Yuhao Ding, Javad Lavaei, Zuo-Jun...
TL;DR
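This work studies convex constrained Markov decision processes (CMDPs) and proposes a policy-based primal-dual algorithm for the resulting constrained optimization problem. By exploiting the convexity hidden in the problem, the authors prove global convergence of the algorithm and establish an $O(T^{-1/3})$ convergence rate in terms of both the optimality gap and the constraint violation.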
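As a rough sketch of the setting summarized above, the problem and its Lagrangian can be written as follows. The notation is assumed here for illustration and does not appear on this page: $\lambda^{\pi}$ is the state-action visitation distribution induced by policy $\pi$, $f$ is the concave objective, $g$ is the convex constraint function (with the sign convention $g \le 0$ assumed), and $\mu \ge 0$ is the dual variable.

$$
\max_{\pi} \; f(\lambda^{\pi}) \quad \text{s.t.} \quad g(\lambda^{\pi}) \le 0,
\qquad
L(\pi, \mu) = f(\lambda^{\pi}) - \mu \, g(\lambda^{\pi}), \quad \mu \ge 0.
$$

Under this reading, a primal-dual method would alternate an ascent step on the policy with a projected descent step on $\mu$, and $T$ in the $O(T^{-1/3})$ rate is presumably the number of such iterations.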
Abstract
We study convex constrained Markov decision processes (CMDPs) in which the objective is concave and the constraints are convex in the state-action visitation distribution. We propose a policy-based primal-dual algorithm...