ICMLFeb, 2022

Saute RL: 使用状态增广实现近乎绝对安全的强化学习

TL;DRSaute MDP can remove safety constraints by augmenting state-space and reshaping objective, allowing for policy generalization and better constraint satisfaction in reinforcement learning.