For multi-agent reinforcement learning systems (MARLS), problem formulation generally requires massive reward engineering effort specific to a given problem. This effort often cannot be transferred to other problems; worse, it is wasted when system dynamics change drastically. The problem is further exacerbated in sparse reward scenarios, where a meaningful heuristic can assist policy convergence. We propose GOVerned Reward Engineering Kernels (GOV-REK), which dynamically assign reward distributions to agents in MARLS during their learning stage. We also introduce governance kernels, which exploit the underlying structure in either the state space or the joint action space to assign meaningful agent reward distributions. During the agent learning stage, GOV-REK iteratively explores different reward distribution configurations with a Hyperband-like algorithm to learn ideal agent reward models in a problem-agnostic manner. Our experiments demonstrate that these meaningful reward priors robustly jumpstart the learning process across different MARL problems.
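
The configuration search described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses plain successive halving (the inner loop of Hyperband) as a stand-in for the Hyperband-like algorithm, and the names `gaussian_kernel_reward`, `toy_evaluate`, `width`, and `scale` are hypothetical. In a real system the evaluator would train the agents for `budget` episodes under the shaped reward and return their mean return.

```python
import math
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def gaussian_kernel_reward(state: Tuple[int, int],
                           goal: Tuple[int, int],
                           config: Config) -> float:
    """Hypothetical governance kernel: a Gaussian bump of reward centred on the goal."""
    dist_sq = (state[0] - goal[0]) ** 2 + (state[1] - goal[1]) ** 2
    return config["scale"] * math.exp(-dist_sq / (2.0 * config["width"] ** 2))


def successive_halving(configs: List[Config],
                       evaluate: Callable[[Config, int], float],
                       min_budget: int = 1,
                       eta: int = 2) -> Config:
    """Keep the top 1/eta configurations each round while growing the budget by eta."""
    if not configs:
        raise ValueError("need at least one configuration")
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Score every surviving configuration at the current budget.
        scored = [(evaluate(c, budget), c) for c in survivors]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        keep = max(1, len(survivors) // eta)
        survivors = [c for _, c in scored[:keep]]
        budget *= eta
    return survivors[0]


def toy_evaluate(config: Config, budget: int) -> float:
    """Placeholder score (assumption): prefers kernel widths near 2.0.

    This toy ignores `budget`; a real evaluator would train for `budget`
    episodes with the shaped reward and report the resulting mean return.
    """
    return -abs(config["width"] - 2.0)


candidates = [{"scale": 1.0, "width": w} for w in (0.5, 1.0, 2.0, 4.0)]
best = successive_halving(candidates, toy_evaluate)
```

Successive halving spends little budget on weak reward configurations and concentrates training on the promising ones, which is the property that makes this family of methods attractive when each evaluation means training a multi-agent system.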

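For multi-agent reinforcement learning systems, we propose the GOVerned Reward Engineering Kernels (GOV-REK) method, which addresses the reward engineering problem and policy convergence in sparse reward scenarios by dynamically assigning reward distributions to agents. It uses a Hyperband-like algorithm to learn ideal agent reward models in a problem-agnostic manner. Experimental results show that the method effectively accelerates the learning process and handles different MARL problems.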

GOV-REK: Governed Reward Engineering Kernels for Designing Robust Multi-Agent Reinforcement Learning Systems