BriefGPT.xyz
May, 2023
Stackelberg Games for Learning Emergent Behaviors During Competitive Autocurricula
Boling Yang, Liyuan Zheng, Lillian J. Ratliff, Byron Boots, Joshua R. Smith
TL;DR
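Uses the Stackelberg Multi-Agent Deep Deterministic Policy Gradient (ST-MADDPG) algorithm to optimize the agents' communication patterns during self-evolution, improving the effectiveness and robustness of multi-agent learning.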
Abstract
Autocurricular training is an important sub-area of multi-agent reinforcement learning (MARL) that allows multiple agents to learn emergent skill