BriefGPT.xyz
Sep, 2011
Risk-Sensitive Reinforcement Learning Applied to Control under Constraints
P. Geibel, F. Wysotzki
TL;DR
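This paper studies Markov decision processes with error states and proposes a heuristic reinforcement learning algorithm, based on risk and value functions, for optimal control tasks. Experimental results show that the algorithm can be applied successfully to control tasks even when the model assumptions are relaxed.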
Abstract
In this paper, we consider Markov decision processes (MDPs) with error states. Error states are those states entering which is undesirable …
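To make the notion of an error state concrete, here is a minimal sketch that estimates, for a fixed policy, how likely each state is to end up in an error state. Everything in it is hypothetical: the four-state chain, its transition probabilities, and the function name `risk_of_policy` are illustrative only. Treating "risk" as the probability of eventually entering an error state is an assumption made for this example, and the code is not the algorithm proposed in the paper.

```python
# Hypothetical 4-state chain induced by some fixed policy (illustrative numbers).
# States 0 and 1 are ordinary, state 2 is a safe terminal state,
# state 3 is an error state (also terminal).
P = [
    [0.5, 0.4, 0.0, 0.1],
    [0.0, 0.5, 0.4, 0.1],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
ERROR_STATES = {3}
TERMINAL_STATES = {2, 3}


def risk_of_policy(P, error_states, terminal_states, sweeps=1000, tol=1e-12):
    """Probability of eventually entering an error state from each state.

    Assumption for this sketch: risk(s) = 1 for error states, 0 for other
    terminal states, and otherwise the expected risk of the successor state.
    Solved by repeated in-place sweeps until the largest change is below tol.
    """
    n = len(P)
    rho = [1.0 if s in error_states else 0.0 for s in range(n)]
    for _ in range(sweeps):
        delta = 0.0
        for s in range(n):
            if s in terminal_states:
                continue  # terminal values stay fixed
            new = sum(P[s][t] * rho[t] for t in range(n))
            delta = max(delta, abs(new - rho[s]))
            rho[s] = new
        if delta < tol:
            break
    return rho


risk = risk_of_policy(P, ERROR_STATES, TERMINAL_STATES)
print([round(r, 4) for r in risk])
```

A risk-sensitive controller could compare such per-state risks against a user-chosen threshold while also tracking a conventional value function. How the two are combined is left open here, since the excerpt does not specify it.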