BriefGPT.xyz
May, 2024
Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence
Minheng Xiao, Xian Yu, Lei Ying
TL;DR
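This paper introduces a policy gradient method for risk-sensitive distributional reinforcement learning, together with a categorical distributional policy gradient algorithm (CDPG) built on distributional policy evaluation and trajectory-based gradient estimation. Experiments on a stochastic cliff-walking environment illustrate the benefit of accounting for risk sensitivity in distributional reinforcement learning.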
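To make the "categorical" and "risk-sensitive" ingredients more concrete, here is a minimal sketch in plain Python. It is not taken from the paper: `project_categorical` is the standard categorical projection used across the distributional RL literature, and `upper_tail_cvar` uses CVaR purely as an assumed example of a risk measure on costs. All function names, the atom grid, and the numbers are hypothetical.

```python
import math


def project_categorical(atoms, probs, cost, gamma):
    """Project the distribution of cost + gamma * Z back onto fixed atoms.

    atoms: evenly spaced, increasing support points (assumed grid).
    probs: probabilities of Z on those atoms, summing to 1.
    """
    n = len(atoms)
    z_min, z_max = atoms[0], atoms[-1]
    delta = (z_max - z_min) / (n - 1)
    out = [0.0] * n
    for z, p in zip(atoms, probs):
        # Shift and scale the atom, then clip it to the support range.
        target = min(max(cost + gamma * z, z_min), z_max)
        pos = (target - z_min) / delta  # fractional index on the grid
        lo = min(int(math.floor(pos)), n - 1)
        hi = min(lo + 1, n - 1)
        w_hi = pos - lo
        # Split the mass between the two neighbouring atoms.
        out[lo] += p * (1.0 - w_hi)
        out[hi] += p * w_hi
    return out


def upper_tail_cvar(atoms, probs, tail):
    """Mean cost over the worst `tail` fraction of probability mass.

    CVaR is an assumption here: one common risk measure for costs.
    """
    remaining = tail
    total = 0.0
    # Walk from the highest cost downwards until the tail mass is used up.
    for z, p in sorted(zip(atoms, probs), reverse=True):
        take = min(p, remaining)
        total += take * z
        remaining -= take
        if remaining <= 0.0:
            break
    return total / tail


atoms = [0.0, 1.0, 2.0, 3.0, 4.0]
next_probs = [0.0, 1.0, 0.0, 0.0, 0.0]  # all mass on cost-to-go 1.0
projected = project_categorical(atoms, next_probs, cost=1.0, gamma=0.5)
uniform = [0.2] * 5
mean_cost = sum(z * p for z, p in zip(atoms, uniform))
tail_cost = upper_tail_cvar(atoms, uniform, tail=0.4)
```

In this toy setup the shifted atom lands halfway between two grid points, so its mass is split evenly between them. The tail average of the uniform distribution is higher than its mean, which is the kind of gap a risk-sensitive objective penalizes and a point estimate ignores.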
Abstract
Risk-sensitive reinforcement learning (RL) is crucial for maintaining reliable performance in many high-stakes applications. While most RL methods aim to learn a point estimate of the random cumulative cost, distributio…