BriefGPT.xyz
Dec, 2023
Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis
Rohan Mitta, Hosein Hasanbeig, Jun Wang, Daniel Kroening, Yiannis Kantaros...
TL;DR
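This paper studies how to keep training safe in reinforcement learning. It proposes a new architecture that handles the trade-off between efficiency and safety, uses Bayesian inference and Markov decision processes to approximate risk, and presents experimental results demonstrating the performance of the overall architecture.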
Abstract
This paper addresses the problem of maintaining safety during training in reinforcement learning (RL), such that the safety constraint violations are bounded at any point during learning. In a variety of RL appli