BriefGPT.xyz
Mar, 2022
Asynchronous, Option-Based Multi-Agent Policy Gradient: A Conditional Reasoning Approach
Multi-Agent Asynchronous Cooperation with Hierarchical Reinforcement Learning
Xubo Lyu, Amin Banitalebi-Dehkordi, Mo Chen, Yong Zhang
TL;DR
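This paper proposes a conditional reasoning approach to address two problems in multi-agent cooperative tasks, centralized control over the high-level action space and obtaining gradients, and validates its effectiveness on representative option-based multi-agent cooperative tasks.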
Abstract
Hierarchical multi-agent reinforcement learning (MARL) has shown significant learning efficiency by searching for policies over higher-level, temporally extended actions (options). However, standard policy gradient-b…