There has been a great deal of research on emergent communication among
artificial agents interacting in simulated environments. Recent studies have
revealed that, in general, emergent languages do not follow the
compositionality patterns of natural language. To address this, prior work
has proposed limited channel capacity as an important constraint for
learning highly compositional languages. In this paper, we show that limited
channel capacity alone is not a sufficient condition and propose an intrinsic reward framework for
improving compositionality in emergent communication. We use a reinforcement
learning setting with two agents, a \textit{task-aware} Speaker and a
\textit{state-aware} Listener, that are required to communicate to perform a set
of tasks. Through our experiments on three different referential game setups,
including a novel environment, gComm, we show that intrinsic rewards improve
compositionality scores by a factor of $\approx \mathbf{1.5}$--$\mathbf{2}$ over existing
frameworks that rely on limited channel capacity.

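This paper proposes an intrinsic reward framework in a reinforcement learning setting with two agents. Combining limited channel capacity with intrinsic rewards, it improves compositionality scores by a factor of roughly 1.5--2 across three different referential game setups, including a novel environment.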