In this paper, we study the tiered reinforcement learning setting, a parallel
transfer learning framework, where the goal is to transfer knowledge from the
low-tier (source) task to the high-tier (target) task to reduce the exploration
risk of the latter while solving the two tasks in
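
This work proposes a framework called the "Policy Transfer Framework", which
uses multi-policy transfer to directly optimize the target policy in
reinforcement learning and can be easily combined with existing deep
reinforcement learning methods. Experimental results show that the framework
significantly accelerates learning and outperforms existing policy transfer
methods in both discrete and continuous action spaces, achieving high learning
efficiency and final performance.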