BriefGPT.xyz
Oct, 2023
Provable Benefits of Multi-task RL under Non-Markovian Decision Making Processes
Ruiquan Huang, Yuan Cheng, Jing Yang, Vincent Tan, Yingbin Liang
TL;DR
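In multi-task reinforcement learning under Markov decision processes, shared latent structure has been shown to significantly improve sample efficiency; this paper investigates whether that benefit extends to partially observable MDPs (POMDPs) and predictive state representations (PSRs).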
Abstract
In multi-task reinforcement learning (RL) under Markov decision processes (MDPs), the presence of shared latent structures among multiple MDPs has been shown to yield significant benefits to the …