BriefGPT.xyz
Jul, 2023
Safe Reinforcement Learning as Wasserstein Variational Inference: Formal Methods for Interpretability
Yanran Wang, David Boyle
TL;DR
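This study proposes a novel Adaptive Wasserstein Variational Optimization (AWaVO) method that uses formal methods to provide transparency in reward design and training convergence, along with a probabilistic interpretation of sequential decisions. It addresses the challenge of interpreting the reward function and the corresponding optimal policy in sequential decision-making problems.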
Abstract
Reinforcement learning or optimal control can provide effective reasoning for sequential decision-making problems with variable dynamics. Such reasoning in practical implementation, however, poses a persistent ch