We consider the task of visual indoor exploration with multiple agents, where the agents need to cooperatively explore the entire indoor region using as few steps as possible. Classical planning-based methods often suffer from particularly expensive computation at each inference step and a limited expressiveness of cooperation strategy. By contrast, reinforcement learning (RL) has become a trending paradigm for tackling this challenge due to its modeling capability of arbitrarily complex strategies and minimal inference overhead. We extend the state-of-the-art single-agent RL solution, Active Neural SLAM (ANS), to the multi-agent setting by introducing a novel RL-based global-goal planner, Spatial Coordination Planner (SCP), which leverages spatial information from each individual agent in an end-to-end manner and effectively guides the agents to navigate towards different spatial goals with high exploration efficiency. SCP consists of a transformer-based relation encoder to capture intra-agent interactions and a spatial action decoder to produce accurate goals. In addition, we also implement a few multi-agent enhancements to process local information from each agent for an aligned spatial representation and more precise planning. Our final solution, Multi-Agent Active Neural SLAM (MAANS), combines all these techniques and substantially outperforms 4 different planning-based methods and various RL baselines in the photo-realistic physical testbed, Habitat.

本研究提出了一种基于强化学习的算法，引入了新颖的多智能体规划模块 MSP 和空间平移变换器 Spatial-TeamFormer，实现了多智能体协作视觉探测，经过政策蒸馏提取的元策略大大提高了最终策略的泛化能力，并在一个真实的 3D 模拟器 Habitat 中表现出比经典规划方法更好的性能。

高效多智能体合作视觉探索学习