BriefGPT.xyz
Jun, 2022
Relative Policy-Transition Optimization for Fast Policy Transfer
Lei Han, Jiawei Xu, Cheng Zhou, Yizheng Zhang, Zhengyou Zhang
TL;DR
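This work studies policy transfer between Markov decision processes (MDPs). It introduces a lemma that measures the relativity between two arbitrary MDPs, proposes two new algorithms, RPO and RTO, along with the complete Relative Policy-Transition Optimization (RPTO) algorithm, and validates their effectiveness on policy transfer problems with differing dynamics built from the classic control tasks in OpenAI Gym.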
Abstract
We consider the problem of policy transfer between two Markov decision processes (MDPs). We introduce a lemma based on existing theoretical results in reinforcement learning (RL) to measure the relativity between
→