Apr, 2017

Multi-Advisor Reinforcement Learning

Romain Laroche, Mehdi Fatemi, Joshua Romoff, Harm van Seijen

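TL;DR: A single-agent RL problem is distributed among multiple learners through distributed learning; the paper examines local planning strategies, introduces a new solution based on an empathic policy, and validates it experimentally on a fruit collection task.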

Abstract

This article deals with a novel branch of Separation of Concerns, called Multi-Advisor Reinforcement Learning (MAd-RL), where a single-agent RL problem is distributed to $n$ learners, called advisors. Each advisor tries to solve the problem with a different focus. Their advice is then