For problems requiring cooperation, many multiagent systems implement
solutions either among individual agents or across an entire population, working towards
a common goal. Multiagent teams are primarily studied when in
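This paper proposes a new algorithm, Learning to Coordinate and Teach Reinforcement (LeCTR), in which each agent in cooperative multiagent reinforcement learning learns when to advise and what to advise, thereby improving team-wide performance and learning. Empirical comparisons show that our teaching agents not only learn faster, but also learn to coordinate in tasks where existing methods fail.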