ICML, Feb 2022
Saute RL: Almost Surely Safe Reinforcement Learning Using State Augmentation
Aivar Sootla, Alexander I. Cowen-Rivers, Taher Jafferjee, Ziyan Wang, David Mguni...
TL;DR: Saute MDP removes safety constraints by augmenting the state space and reshaping the objective, which allows policy generalization and better constraint satisfaction in reinforcement learning.
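A minimal sketch of the state-augmentation idea in the TL;DR may help make it concrete. It assumes a scalar "safety state" `z` that tracks the remaining normalized safety budget, is appended to the observation, and switches the reward to a penalty once the budget is exhausted. The names `augment`, `saute_step`, `budget`, and `unsafe_reward` are hypothetical, and the update rule is an assumed form for illustration, not something stated in this summary.

```python
from typing import Tuple


def augment(obs: Tuple[float, ...], z: float) -> Tuple[float, ...]:
    """Append the safety state z to the original observation."""
    return obs + (z,)


def saute_step(
    z: float,
    cost: float,
    reward: float,
    budget: float,
    gamma: float = 1.0,
    unsafe_reward: float = -1.0,
) -> Tuple[float, float]:
    """Update the safety state and reshape the reward for one transition.

    Assumed update: z' = (z - cost / budget) / gamma, starting from z = 1.
    The task reward is kept while z' >= 0 and replaced by a fixed
    penalty (unsafe_reward) once the budget is used up.
    """
    z_next = (z - cost / budget) / gamma
    shaped_reward = reward if z_next >= 0 else unsafe_reward
    return z_next, shaped_reward


# Usage: two transitions against a total budget of 1.0.
z = 1.0
z, r1 = saute_step(z, cost=0.5, reward=2.0, budget=1.0)
z, r2 = saute_step(z, cost=1.0, reward=2.0, budget=1.0)
obs_aug = augment((0.0, 0.0), z)
```

Because the policy sees `z` as part of its input, the constraint no longer has to be handled separately by the optimizer, which is the sense in which the constraint is "removed" in this sketch.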