BriefGPT.xyz
May, 2023
On practical robust reinforcement learning: adjacent uncertainty set and double-agent algorithm
Ukjo Hwang, Songnam Hong
TL;DR
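Introduces a new uncertainty set and, building on it, proposes a robust reinforcement learning method called ARQ-Learning. It also proposes a technique for efficiently extending ARQ-Learning to large-scale or continuous state spaces, and applies it to various reinforcement learning applications that involve model uncertainty.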
Abstract
Robust reinforcement learning (RL) aims at learning a policy that optimizes the worst-case performance over an uncertainty set. Given a nominal Markov decision process (N-MDP) that generates samples for training, t
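To make the worst-case objective in the abstract concrete, here is a minimal sketch of tabular robust value iteration. It is not the ARQ-Learning method from the paper and does not use the adjacent uncertainty set. It assumes the uncertainty set is a small finite list of transition kernels and takes the minimum independently for each state-action pair (a rectangular relaxation). The function name `robust_value_iteration` and all numbers are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: hypothetical names and toy numbers, not the paper's algorithm.

def robust_value_iteration(rewards, kernels, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Worst-case value iteration over a finite list of transition kernels.

    rewards[s][a]      : immediate reward for action a in state s
    kernels[k][s][a][t]: probability of moving from s to t under a in kernel k
    The minimum over kernels is taken separately for every (s, a) pair.
    """
    n_states = len(rewards)
    n_actions = len(rewards[0])
    values = [0.0] * n_states
    for _ in range(max_iters):
        new_values = []
        for s in range(n_states):
            q_values = []
            for a in range(n_actions):
                # Worst expected next-state value across the uncertainty set.
                worst = min(
                    sum(p[s][a][t] * values[t] for t in range(n_states))
                    for p in kernels
                )
                q_values.append(rewards[s][a] + gamma * worst)
            new_values.append(max(q_values))
        delta = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values


# Toy 2-state, 2-action problem (assumed numbers).
rewards = [[0.0, 0.1], [1.0, 0.5]]
nominal = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]]
perturbed = [[[0.7, 0.3], [0.4, 0.6]], [[0.6, 0.4], [0.3, 0.7]]]

nominal_values = robust_value_iteration(rewards, [nominal])
robust_values = robust_value_iteration(rewards, [nominal, perturbed])
```

With only the nominal kernel in the list, the routine reduces to ordinary value iteration. Adding more kernels can only lower the resulting values, which is the pessimism that a robust policy is optimized against.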