Oct, 2021
A Dual Approach to Constrained Markov Decision Processes with Entropy Regularization
Donghao Ying, Yuhao Ding, Javad Lavaei
TL;DR
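This paper studies entropy-regularized constrained Markov decision processes under the soft-max parameterization, examining the Lagrangian dual function and constraint violation, and proposes an accelerated dual-descent method that achieves global convergence.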
Abstract
We study entropy-regularized constrained Markov decision processes (CMDPs) under the soft-max parameterization, in which an agent aims to maximize the
→