Recent advances in Multi-agent Reinforcement Learning (MARL) have extended its
application to various safety-critical scenarios. However, most methods focus
on online learning, which presents substantial risks when deployed in
real-world settings. To address this challenge, we introduce a framework that
integrates diffusion models into the MARL paradigm.
This approach notably enhances the safety of actions taken by multiple agents
through risk mitigation while modeling coordinated action. Our framework is
grounded in the Centralized Training with Decentralized Execution (CTDE)
architecture, augmented by a diffusion model for predictive trajectory
generation. Additionally, we incorporate a specialized algorithm to further
ensure operational safety. We evaluate our model against baselines on the DSRL
benchmark. Experimental results demonstrate that our model not only adheres to
stringent safety constraints but also achieves superior performance compared to
existing methodologies. This underscores the potential of our approach in
advancing the safety and efficacy of MARL in real-world applications.
