Cooperative multi-agent reinforcement learning (MARL) has become an increasingly important research topic over the last half-decade because of its great potential for real-world applications. Because of the curse of dimensionality, the popular "centralized training, decentralized execution" framework requires long training times yet still struggles to converge efficiently. In this paper, we propose a general training framework, MARL-LNS, that addresses these issues algorithmically by training on alternating subsets of agents, using existing deep MARL algorithms as low-level trainers and introducing no additional trainable parameters. Based on this framework, we provide three algorithm variants: random large neighborhood search (RLNS), batch large neighborhood search (BLNS), and adaptive large neighborhood search (ALNS), which alternate the subsets of agents in different ways. We test our algorithms on both the StarCraft Multi-Agent Challenge and Google Research Football, showing that they can reduce training time by at least 10% while reaching the same final skill level as the original algorithm.

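Cooperative multi-agent reinforcement learning has become an increasingly important research topic over the past five years because of its great potential for real-world applications. This paper proposes a general training framework, MARL-LNS, which tackles the curse of dimensionality by training on alternating subsets of agents, using existing deep MARL algorithms as low-level trainers and requiring no additional trainable parameters. Based on this framework, we provide three algorithm variants: random large neighborhood search (RLNS), batch large neighborhood search (BLNS), and adaptive large neighborhood search (ALNS), which alternate the subsets of agents in different ways. We test our algorithms on the StarCraft Multi-Agent Challenge and Google Research Football, and show that they can reduce training time by at least 10% while reaching the same final skill level as the original algorithm.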
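
The alternating-subset idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `random_subsets`, `batch_subsets`, `train_with_lns`, and the `low_level_train` callback are hypothetical, and the batch rule shown (shuffle the agents once, then sweep over consecutive groups) is only an assumed reading of "batch". The adaptive variant is left out because the abstract does not state its rule.

```python
import random
from typing import Callable, Iterator, List


def random_subsets(n_agents: int, k: int, rng: random.Random) -> Iterator[List[int]]:
    """RLNS-style selection: draw a fresh random subset of k agents each time."""
    while True:
        yield sorted(rng.sample(range(n_agents), k))


def batch_subsets(n_agents: int, k: int, rng: random.Random) -> Iterator[List[int]]:
    """BLNS-style selection (assumed rule): shuffle once, then sweep in groups of k."""
    order = list(range(n_agents))
    rng.shuffle(order)
    start = 0
    while True:
        # Wrap around the shuffled order so every agent is visited in turn.
        yield sorted(order[(start + i) % n_agents] for i in range(k))
        start = (start + k) % n_agents


def train_with_lns(
    subsets: Iterator[List[int]],
    low_level_train: Callable[[List[int]], None],
    iterations: int,
) -> List[List[int]]:
    """Outer loop: each iteration trains only the selected agents.

    `low_level_train` stands in for one update of an existing deep MARL
    algorithm restricted to the given agents (hypothetical interface).
    """
    history = []
    for _, subset in zip(range(iterations), subsets):
        low_level_train(subset)
        history.append(subset)
    return history
```

In this sketch the outer loop adds no trainable parameters of its own. It only decides which agents the low-level trainer updates in each iteration, which matches the framework's stated goal.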

MARL-LNS: Cooperative Multi-Agent Reinforcement Learning via Large Neighborhood Search