BriefGPT.xyz
May, 2022
On the Robustness of Safe Reinforcement Learning under Observational Perturbations
Zuxin Liu, Zijian Guo, Zhepeng Cen, Huan Zhang, Jie Tan...
TL;DR
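This paper studies the safety and robustness of safe reinforcement learning under adversarial attacks on observations. It proposes two new attack methods, one that maximizes cost and one that maximizes reward, along with a robust training framework.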
Abstract
Safe reinforcement learning (RL) trains a policy to maximize the task reward while satisfying safety constraints. While prior works focus on the performance optimality, we find that the optimal solutions of many