We propose a model that enables multiple decentralized agents to share their perception of the environment in a fair and adaptive way. The model takes both the current message and historical observations into account, handling them in the same recurrent model but in different forms. Specifically, we present a dual-level recurrent communication framework for multi-agent systems: the first recurrence runs over the communication sequence and transmits communication data among agents, while the second runs over the time sequence and combines each agent's historical observations. The resulting communication flow keeps communication messages separate from memories, yet the dual-level recurrence still lets agents share their historical observations. This design allows agents to adapt to a changing set of communication partners while keeping the communication results fair to all of them. We discuss the method in detail in both partially observable and fully observable environments. Results from several experiments suggest that our method outperforms existing decentralized communication frameworks and the corresponding centralized training method.
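To make the two levels of recurrence concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes plain tanh cells, random untrained weights, and that every agent reads the same final message; the names `cell`, `step`, `w_comm`, and `w_time` are hypothetical and chosen only for illustration.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def cell(weights, inputs):
    """One tanh recurrent cell: concatenate the inputs, apply weights, squash."""
    x = [v for part in inputs for v in part]
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def step(observations, memories, w_comm, w_time, msg_dim):
    """One environment time step for all agents (hypothetical helper).

    Level 1 (communication sequence): a message is passed from agent to agent.
    Level 2 (time sequence): each agent updates its own memory.
    """
    message = [0.0] * msg_dim
    for obs, mem in zip(observations, memories):
        # The message is updated from the previous message, the agent's
        # current observation, and its memory of past observations.
        message = cell(w_comm, [message, obs, mem])
    # Assumption: every agent reads the same final message.
    new_memories = [cell(w_time, [mem, obs, message])
                    for obs, mem in zip(observations, memories)]
    return message, new_memories


rng = random.Random(0)
n_agents, obs_dim, mem_dim, msg_dim = 3, 4, 5, 2
w_comm = make_matrix(msg_dim, msg_dim + obs_dim + mem_dim, rng)
w_time = make_matrix(mem_dim, mem_dim + obs_dim + msg_dim, rng)

memories = [[0.0] * mem_dim for _ in range(n_agents)]
for t in range(4):
    observations = [[rng.uniform(-1, 1) for _ in range(obs_dim)]
                    for _ in range(n_agents)]
    message, memories = step(observations, memories, w_comm, w_time, msg_dim)
```

In this sketch the message and the memories are separate vectors, but the message is computed from each agent's memory, so historical observations still reach the other agents. Because the same cell weights are reused along the communication sequence, `step` accepts any number of agents without changing the weight shapes.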

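This study proposes a model that enables multiple decentralized agents to share their perception of the environment in a fair and adaptive way. We present a dual-level recurrent communication framework for multi-agent systems, in which the first recurrence runs over the communication sequence and transmits communication data among agents, while the second recurrence runs over the time sequence and combines each agent's historical observations. The method is discussed in detail in both partially observable and fully observable environments, and results from several experiments suggest that it outperforms existing decentralized communication frameworks and the corresponding centralized training method.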

A Decentralized Communication Framework Based on Dual-Level Recurrence for Multi-Agent Reinforcement Learning