BriefGPT.xyz
Aug, 2024
Generalized Gaussian Temporal Difference Error for Uncertainty-aware Reinforcement Learning
Seyeon Kim, Joonhun Lee, Namhoon Cho, Sungjun Han, Seungeon Baek
TL;DR
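This study addresses how conventional uncertainty-aware temporal difference learning methods characterize errors and estimate uncertainty, and proposes a new generalized Gaussian error modeling framework. By incorporating higher-order moments, particularly kurtosis, the framework improves the estimation and mitigation of data-dependent noise, which in turn yields significant performance gains in policy gradient algorithms.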
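To make the role of kurtosis concrete, the following is a minimal sketch of the standard generalized Gaussian distribution, not the paper's implementation. The function names `ggd_nll` and `ggd_excess_kurtosis`, and the parameter names `alpha` (scale), `beta` (shape), and `mu` (location), are assumptions chosen for illustration. With `beta = 2` the distribution reduces to a Gaussian; smaller `beta` gives heavier tails.

```python
import math


def ggd_nll(x, alpha, beta, mu=0.0):
    """Negative log-likelihood of one sample under a generalized Gaussian.

    Density: beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x - mu| / alpha) ** beta)
    All names here are illustrative assumptions, not the paper's API.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    # Log of the normalizing constant, using lgamma for numerical stability.
    log_norm = math.log(beta) - math.log(2.0 * alpha) - math.lgamma(1.0 / beta)
    return -log_norm + (abs(x - mu) / alpha) ** beta


def ggd_excess_kurtosis(beta):
    """Excess kurtosis: Gamma(5/b) * Gamma(1/b) / Gamma(3/b)**2 - 3."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    log_ratio = (
        math.lgamma(5.0 / beta)
        + math.lgamma(1.0 / beta)
        - 2.0 * math.lgamma(3.0 / beta)
    )
    return math.exp(log_ratio) - 3.0


if __name__ == "__main__":
    # beta = 2 is the Gaussian case (excess kurtosis approximately 0);
    # beta = 1 is the Laplace case (excess kurtosis approximately 3).
    for beta in (2.0, 1.0):
        print(beta, ggd_excess_kurtosis(beta))
```

The shape parameter is what a zero-mean Gaussian assumption fixes in advance: treating `beta` as something to estimate lets the error model express tail heaviness rather than only variance.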
Abstract
Conventional uncertainty-aware temporal difference (TD) learning methods often rely on simplistic assumptions, typically including a zero-mean Gaussian distribution for TD errors. Such oversimplification can lead