BriefGPT.xyz
Jul, 2023
Probabilistic Counterexample Guidance for Safer Reinforcement Learning
Xiaotong Ji, Antonio Filieri
TL;DR
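This paper proposes a method for safe exploration that guides training with counterexamples of the safety requirement. It abstracts both continuous and discrete state-space systems into compact abstract models, then uses probabilistic counterexample generation to construct minimal simulation submodels that elicit violations of the safety requirement. The agent trains its policy efficiently on these submodels so as to minimise the risk of safety violations during subsequent online exploration.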
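To make the counterexample idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the abstract model is already available as a small discrete-time Markov chain (the hypothetical `MODEL` dictionary) and that the safety requirement bounds the probability of reaching an unsafe state by a hypothetical `threshold`. It uses most-probable-path enumeration, a common way to build probabilistic counterexamples, which may differ from the construction used in the paper. All names and numbers are illustrative assumptions.

```python
import heapq

# Hypothetical abstract model: state -> {successor: transition probability}.
MODEL = {
    "s0": {"s1": 0.5, "s2": 0.5},
    "s1": {"bad": 0.4, "goal": 0.6},
    "s2": {"bad": 0.1, "goal": 0.9},
}


def counterexample_paths(transitions, start, unsafe, threshold, max_expansions=10_000):
    """Collect the most probable paths from `start` to an unsafe state
    until their total probability exceeds `threshold`."""
    # Heap entries are (-probability, path); heapq pops the most probable path first.
    frontier = [(-1.0, (start,))]
    paths, mass, expansions = [], 0.0, 0
    while frontier and mass <= threshold and expansions < max_expansions:
        neg_p, path = heapq.heappop(frontier)
        expansions += 1
        state = path[-1]
        if state in unsafe:
            # Unsafe states are treated as absorbing: record the path and stop extending it.
            paths.append((path, -neg_p))
            mass += -neg_p
            continue
        for nxt, p in transitions.get(state, {}).items():
            if p > 0.0:
                heapq.heappush(frontier, (neg_p * p, path + (nxt,)))
    return paths, mass


def submodel_states(paths):
    """States touched by the counterexample paths: the part of the model to simulate."""
    return {state for path, _ in paths for state in path}


paths, mass = counterexample_paths(MODEL, "s0", {"bad"}, threshold=0.15)
print(paths, mass, submodel_states(paths))
```

In this toy model a single path through `s1` already carries enough probability to violate the assumed bound, so the submodel leaves out `s2`. An agent could then be trained offline on simulations restricted to those states.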
Abstract
Safe exploration aims at addressing the limitations of reinforcement learning (RL) in safety-critical scenarios, where failures during trial-and-error learning may incur high costs. Several methods exist to incor