Training agents in multi-agent competitive games is difficult because the
game dynamics depend not only on the environment but also on the opponents'
strategies. Existing methods often suffer from slow convergence and
instability.
To address these issues, we use imitation learning to model and anticipate
opponents' behavior, with the aim of reducing uncertainty about the game
dynamics. Our key contributions are: (i) a new
multi-agent imitation learning model for predicting the opponents' next moves,
which works even when opponents' actions are hidden and only local
observations are available; (ii)
a new multi-agent reinforcement learning algorithm that combines our imitation
learning model and policy training into a single training process; and (iii)
extensive experiments in three challenging game environments, including an
advanced version of the StarCraft Multi-Agent Challenge (SMACv2).
Experimental results show that our approach achieves superior performance
compared to existing state-of-the-art multi-agent RL algorithms.

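We propose a new multi-agent imitation learning model for predicting opponents'
next moves, together with a multi-agent reinforcement learning algorithm that
combines this model with policy training in a single training process.
Extensive experiments in three challenging game environments show that our
approach outperforms existing multi-agent reinforcement learning algorithms.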