Training agents in multi-agent competitive games presents significant challenges due to their intricate nature. These challenges are exacerbated by dynamics influenced not only by the environment but also by opponents' strategies. Existing methods often struggle with slow convergence and instability. To address this, we harness the potential of imitation learning to comprehend and anticipate opponents' behavior, aiming to mitigate uncertainties with respect to the game dynamics. Our key contributions include: (i) a new multi-agent imitation learning model for predicting next moves of the opponents -- our model works with hidden opponents' actions and local observations; (ii) a new multi-agent reinforcement learning algorithm that combines our imitation learning model and policy training into one single training process; and (iii) extensive experiments in three challenging game environments, including an advanced version of the Star-Craft multi-agent challenge (i.e., SMACv2). Experimental results show that our approach achieves superior performance compared to existing state-of-the-art multi-agent RL algorithms.

我们提出了一种新的多智能体模仿学习模型，用于预测对手的下一步动作，并将其与策略训练结合为一个训练过程的多智能体强化学习算法，在三个具有挑战性的游戏环境中进行了广泛实验，结果表明我们的方法在性能上优于现有的多智能体强化学习算法。

模仿以获胜：多智能竞争游戏中的模仿学习策略