Planning with Abstract Learned Models While Learning Transferable Subtasks

December 2019

Planning with Abstract Learned Models While Learning Transferable Subtasks

John Winder, Stephanie Milani, Matthew Landen, Erebus Oh, Shane Parr...

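TL;DR: Using a new formal structure, this work proposes PALM, a model-based hierarchical reinforcement learning algorithm that learns self-contained, modular transition and reward models for probabilistic planning. The authors show that PALM integrates planning and execution to learn abstract, hierarchical models quickly and efficiently, and that it has enhanced potential for transfer to new, related tasks.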
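
To make "learning transition and reward models for planning" concrete, the following is a minimal single-level sketch, not the PALM algorithm itself. It assumes a small tabular environment, estimates the model from counts of observed transitions, and plans with value iteration. The names `TabularModel` and `value_iteration`, and the two-state example, are hypothetical and do not come from the paper.

```python
from collections import defaultdict


class TabularModel:
    """Count-based estimate of transitions and rewards (hypothetical helper)."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))  # (s, a) -> s' -> count
        self.reward_sum = defaultdict(float)                 # (s, a) -> summed reward
        self.visits = defaultdict(int)                       # (s, a) -> visit count

    def update(self, s, a, r, s_next):
        """Record one observed transition (s, a, r, s_next)."""
        self.counts[(s, a)][s_next] += 1
        self.reward_sum[(s, a)] += r
        self.visits[(s, a)] += 1

    def transition(self, s, a):
        """Maximum-likelihood estimate of P(s' | s, a); empty if (s, a) is unseen."""
        n = self.visits[(s, a)]
        if n == 0:
            return {}
        return {s2: c / n for s2, c in self.counts[(s, a)].items()}

    def reward(self, s, a):
        """Mean observed reward for (s, a); 0.0 if (s, a) is unseen."""
        n = self.visits[(s, a)]
        return self.reward_sum[(s, a)] / n if n else 0.0


def value_iteration(model, states, actions, gamma=0.9, sweeps=100):
    """Plan on the learned model with in-place value iteration."""
    values = {s: 0.0 for s in states}
    for _ in range(sweeps):
        for s in states:
            values[s] = max(
                model.reward(s, a)
                + gamma * sum(p * values[s2]
                              for s2, p in model.transition(s, a).items())
                for a in actions
            )
    return values


# Assumed toy environment: two states, "stay" in state 1 pays reward 1.
model = TabularModel()
model.update(0, "stay", 0.0, 0)
model.update(0, "go", 0.0, 1)
model.update(1, "stay", 1.0, 1)
model.update(1, "go", 0.0, 0)

values = value_iteration(model, states=[0, 1], actions=["stay", "go"])
```

In this toy setting the learned model is exact after one visit per state-action pair, so the planned values approach those of the true environment. A hierarchical method such as the one summarized above would keep a separate model of this kind for each subtask rather than one flat model.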

Abstract

We introduce an algorithm for model-based hierarchical reinforcement learning to acquire self-contained transition and reward models suitable for probabilistic planning at multiple levels of abstraction. We call