Robust Reinforcement Learning (RL) focuses on improving performance under model errors or adversarial attacks, which facilitates the real-life deployment of RL agents. Robust Adversarial Reinforcement Learning (RARL) is one of the most popular frameworks for robust RL. However, most of the existing literature models RARL as a zero-sum simultaneous game with Nash equilibrium as the solution concept, which can overlook the sequential nature of RL deployments, produce overly conservative agents, and induce training instability. In this paper, we introduce a novel hierarchical formulation of robust RL, a general-sum Stackelberg game model called RRL-Stack, to formalize this sequential nature and provide extra flexibility for robust training. We develop the Stackelberg Policy Gradient algorithm to solve RRL-Stack, leveraging the Stackelberg learning dynamics by considering the adversary's response. Our method generates challenging yet solvable adversarial environments that benefit RL agents' robust learning. Our algorithm demonstrates better training stability and robustness against different testing conditions in single-agent robotics control and multi-agent highway merging tasks.
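
To make the phrase "considering the adversary's response" concrete, the following is a minimal sketch of Stackelberg gradient dynamics on a toy problem. It is not the paper's Stackelberg Policy Gradient algorithm. The two scalar quadratic costs, the function name `stackelberg_gradient_play`, the `anticipate` flag, and the step sizes are all assumptions chosen for illustration, and the sketch assumes the agent acts as leader while the adversary acts as follower. The leader differentiates its cost through the follower's best response (via the implicit function theorem) instead of treating the follower's action as fixed.

```python
def stackelberg_gradient_play(steps=2000, lr_leader=0.05, lr_follower=0.2,
                              anticipate=True):
    """Toy leader/follower gradient play on two scalar quadratic costs.

    Hypothetical costs, chosen only for illustration (general-sum, not zero-sum):
      leader   minimizes f1(x, y) = (x - 1)^2 + x * y
      follower minimizes f2(x, y) = (y - x)^2, so its best response is y*(x) = x
    """
    x, y = 0.0, 0.0
    for _ in range(steps):
        # Partial derivatives of the leader's cost.
        df1_dx = 2.0 * (x - 1.0) + y
        df1_dy = x
        # Second derivatives of the follower's cost (constant for this toy f2).
        d2f2_dy2 = 2.0
        d2f2_dydx = -2.0
        # Implicit function theorem: sensitivity of the follower's best
        # response to the leader's action, dy*/dx = -(d2f2/dy2)^-1 * d2f2/dydx.
        dy_dx = -d2f2_dydx / d2f2_dy2
        # Stackelberg (total) gradient adds the response term; plain
        # simultaneous gradient play drops it.
        leader_grad = df1_dx + (df1_dy * dy_dx if anticipate else 0.0)
        follower_grad = 2.0 * (y - x)
        # Both players update from the same previous iterate.
        x, y = x - lr_leader * leader_grad, y - lr_follower * follower_grad
    return x, y
```

With `anticipate=True` the iterates approach x = y = 0.5, the minimizer of f1(x, y*(x)) = (x - 1)^2 + x^2. With `anticipate=False` they approach x = y = 2/3, the stationary point of simultaneous play, which shows how the solution concept changes the leader's final action even on this small example.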

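This paper introduces RRL-Stack, a Stackelberg game model for reinforcement learning that provides extra flexibility for robust training and addresses the overly conservative agents and training instability found in current robust RL training. It also proposes a solution based on the Stackelberg Policy Gradient algorithm, which shows better training stability and robustness on single-agent and multi-agent tasks.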

Robust Reinforcement Learning as a Stackelberg Game via Adaptively-Regularized Adversarial Training