BriefGPT.xyz
Jun, 2024
Time-Constrained Robust MDPs
Adil Zouitine, David Bertoin, Pierre Clavier, Matthieu Geist, Emmanuel Rachelson
TL;DR
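By introducing a new time-constrained robust Markov decision process (TC-RMDP) formulation that accounts for multifactorial, correlated, and time-dependent disturbances, this study revisits the traditional assumptions of robust reinforcement learning. It opens new paths toward more practical and realistic reinforcement learning applications and achieves an efficient trade-off between performance and robustness in time-constrained environments.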
Abstract
Robust reinforcement learning is essential for deploying reinforcement learning algorithms in real-world scenarios where environmental uncertainty predominates. Traditional robust reinforcement learning often dep