BriefGPT.xyz
May, 2023
Model-Free Robust Average-Reward Reinforcement Learning
Yue Wang, Alvaro Velasquez, George Atia, Ashley Prater-Bennette, Shaofeng Zou
TL;DR
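This study focuses on how to handle the effect of model uncertainty on Markov decision processes. It proposes two model-free algorithms and examines commonly used uncertainty sets.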
Abstract
Robust Markov decision processes (MDPs) address the challenge of model uncertainty by optimizing the worst-case performance over an uncertainty set of MDPs. In this paper, we focus on the robust