Scalability is a key obstacle to applying multi-agent reinforcement learning (MARL) algorithms to real-world problems, which typically involve a very large number of agents. Parameter sharing across agents has been widely used to address it, since it shortens training by reducing the number of parameters and improving sample efficiency. However, using the same parameters for every agent limits the representational capacity of the joint policy, and performance can consequently degrade in multi-agent tasks that require different behaviors from different agents. In this paper, we propose a simple method that applies structured pruning to a deep neural network to increase the representational capacity of the joint policy without introducing additional parameters. We evaluate the proposed method on several benchmark tasks, and numerical results show that it significantly outperforms other parameter-sharing methods.
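The core idea of the abstract, one shared network from which each agent uses a differently pruned subnetwork, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`make_shared_params`, `make_agent_masks`, `forward`), the two-layer architecture, and in particular the choice of fixed random masks over hidden units. The abstract does not say how the pruned structure is selected.

```python
import random

def make_shared_params(in_dim, hidden_dim, out_dim, rng):
    """Create one set of weights that every agent shares (no per-agent parameters)."""
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2

def make_agent_masks(n_agents, hidden_dim, keep_ratio, rng):
    """Assumption: each agent gets a fixed random binary mask over hidden units."""
    n_keep = max(1, int(round(keep_ratio * hidden_dim)))
    masks = []
    for _ in range(n_agents):
        kept = set(rng.sample(range(hidden_dim), n_keep))
        masks.append([1.0 if j in kept else 0.0 for j in range(hidden_dim)])
    return masks

def forward(params, mask, obs):
    """Run the shared network with whole hidden units removed by the agent's mask."""
    w1, w2 = params
    # ReLU hidden layer; multiplying by the mask prunes entire units (structured pruning).
    hidden = [max(0.0, sum(w * x for w, x in zip(row, obs))) * m
              for row, m in zip(w1, mask)]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

rng = random.Random(0)
params = make_shared_params(in_dim=4, hidden_dim=8, out_dim=2, rng=rng)
masks = make_agent_masks(n_agents=3, hidden_dim=8, keep_ratio=0.5, rng=rng)

obs = [0.5, -0.2, 0.1, 0.9]
# Same weights and same observation, but each agent evaluates its own subnetwork.
outputs = [forward(params, m, obs) for m in masks]
```

In this sketch the masks are stored alongside the weights rather than learned, so the trainable parameter count is the same as with plain parameter sharing, while agents with different masks can map the same observation to different outputs.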

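This paper proposes a method based on structured pruning of deep neural networks that aims to increase the representational capacity of the joint policy, thereby reducing the performance penalty that parameter sharing incurs in multi-agent reinforcement learning tasks requiring different behaviors. Results on several benchmarks show that the proposed method improves significantly over other parameter-sharing methods.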

Parameter Sharing with Network Pruning for Scalable Multi-Agent Deep Reinforcement Learning