One of the fundamental quests of AI is to produce agents that coordinate well
with humans. This problem is challenging, especially in domains that lack
high-quality human behavioral data, because multi-agent reinforcement learning (RL)
often converges to different equilibria from the o