When deploying autonomous agents in the real world, we need effective ways of
communicating objectives to them. Traditional skill learning has revolved
around reinforcement and imitation learning, each with rigid
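
This paper proposes a new algorithm, Learning to Coordinate and Teach Reinforcement (LeCTR), which improves team-wide performance and learning in cooperative multiagent reinforcement learning by having each agent learn when to give advice and what advice to give. Empirical comparisons show that our teaching agents not only learn faster, but also learn to coordinate on tasks where existing methods fail.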