BriefGPT.xyz
Jun, 2023
Risk-sensitive Actor-free Policy via Convex Optimization
Ruoqi Zhang, Jens Sjölund
TL;DR
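This work proposes a risk-sensitive objective function based on conditional risk and models it with an input-convex neural network. Because the model is convex in the action, the globally optimal action can be found with simple gradient-following methods. The approach is reported to maintain effective risk control.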
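To make the convexity argument concrete, here is a minimal sketch in NumPy, not the authors' implementation. It assumes a one-hidden-layer network with random, untrained weights, treats the objective as a cost to be minimized, and restricts actions to the box [-1, 1]; the names `TinyICNN` and `best_action` are hypothetical. The function is convex in the action because softplus is convex and non-decreasing, its argument is affine in the action, and the output weights are non-negative. Projected gradient descent on a convex function over a box therefore reaches the same minimum value from any starting point.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)); convex and non-decreasing.
    return np.logaddexp(0.0, x)

def sigmoid(x):
    # Derivative of softplus, written via tanh for numerical stability.
    return 0.5 * (1.0 + np.tanh(0.5 * x))

class TinyICNN:
    """Toy stand-in for a risk-sensitive objective f(s, a) that is convex in a."""

    def __init__(self, state_dim, action_dim, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.Wa = rng.normal(size=(hidden, action_dim))  # action weights (any sign)
        self.Ws = rng.normal(size=(hidden, state_dim))   # state weights (any sign)
        self.b = rng.normal(size=hidden)
        self.wz = np.abs(rng.normal(size=hidden))        # non-negative: keeps convexity
        self.wl = rng.normal(size=action_dim)            # linear skip term in the action

    def value(self, s, a):
        pre = self.Wa @ a + self.Ws @ s + self.b
        return self.wz @ softplus(pre) + self.wl @ a

    def grad_a(self, s, a):
        # Gradient of value(s, a) with respect to the action a.
        pre = self.Wa @ a + self.Ws @ s + self.b
        return self.Wa.T @ (self.wz * sigmoid(pre)) + self.wl

def best_action(net, s, a0, steps=5000, low=-1.0, high=1.0):
    # Step size 1/L, where L upper-bounds the gradient's Lipschitz constant
    # (the derivative of sigmoid is at most 0.25).
    L = 0.25 * net.wz.max() * np.linalg.norm(net.Wa, 2) ** 2
    a = np.clip(a0, low, high)
    for _ in range(steps):
        # Gradient step followed by projection onto the action box.
        a = np.clip(a - net.grad_a(s, a) / L, low, high)
    return a

net = TinyICNN(state_dim=3, action_dim=2)
s = np.array([0.5, -1.0, 0.2])
a_star = best_action(net, s, a0=np.zeros(2))
```

No separate actor network is needed in this sketch: the action is obtained by optimizing the convex objective directly, which is one reading of "actor-free" under the assumptions above.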
Abstract
Traditional reinforcement learning methods optimize agents without considering safety, potentially resulting in unintended consequences. In this paper, we propose an optimal actor-free policy that optimizes a ...