multi-agent reinforcement learning (MARL) methods find optimal policies for
agents that operate in the presence of other learning agents. Central to
achieving this is how the agents coordinate. One way to coordinate is by
learning to communicate with each other. Can the agents develop