BriefGPT.xyz
Dec, 2022
Effects of Spectral Normalization in Multi-agent Reinforcement Learning
Kinal Mehta, Anuj Mahajan, Pawan Kumar
TL;DR
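This paper examines how to learn a reliable critic in multi-agent sparse-reward settings, and studies how regularizing the critic with spectral normalization enables more stable learning, even in the complex SMAC and RWARE domains.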
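This summary does not say how the spectral normalization is applied to the critic. As a rough illustration of the general technique only, the sketch below estimates a weight matrix's largest singular value by power iteration and divides the weights by it, which is the usual way spectral normalization is computed. The function names `spectral_norm` and `spectrally_normalize`, the iteration count, and the example matrix are assumptions made for this sketch, not details taken from the paper.

```python
import math
import random


def spectral_norm(weight, n_iters=50, seed=0):
    """Estimate the largest singular value of `weight` (a list of rows)
    by power iteration. Hypothetical helper for illustration."""
    rng = random.Random(seed)
    n_rows, n_cols = len(weight), len(weight[0])
    # Random start vector in the input space.
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    sigma = 0.0
    for _ in range(n_iters):
        # u = W v, rescaled to unit length (left singular direction).
        u = [sum(w * x for w, x in zip(row, v)) for row in weight]
        u_norm = math.sqrt(sum(x * x for x in u)) or 1.0
        u = [x / u_norm for x in u]
        # v = W^T u; its length converges to the top singular value.
        v = [sum(weight[i][j] * u[i] for i in range(n_rows))
             for j in range(n_cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        if sigma == 0.0:
            break
        v = [x / sigma for x in v]
    return sigma


def spectrally_normalize(weight, n_iters=50):
    """Return a copy of `weight` rescaled so its spectral norm is about 1."""
    sigma = spectral_norm(weight, n_iters)
    if sigma == 0.0:
        return [row[:] for row in weight]
    return [[w / sigma for w in row] for row in weight]


# Assumed toy "critic layer" weights, chosen so the answer is easy to check.
W = [[3.0, 0.0], [0.0, 1.0]]
W_sn = spectrally_normalize(W)
```

Dividing by the top singular value bounds the layer's Lipschitz constant by roughly 1, which is the property usually cited as the reason spectral normalization stabilizes training. Deep learning libraries typically ship their own version of this (PyTorch, for example, has `torch.nn.utils.spectral_norm`), and that would normally be preferred over a hand-rolled loop.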
Abstract
A reliable critic is central to on-policy actor-critic learning. But it becomes challenging to learn a reliable critic in a
→