BriefGPT.xyz
Feb, 2023
Model-Based Uncertainty in Value Functions
Carlos E. Luis, Alessandro G. Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters
TL;DR
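In model-based reinforcement learning, we consider how to quantify uncertainty over cumulative rewards and propose a new uncertainty Bellman equation that addresses shortcomings of existing work; the method tells us more accurately where past exploration has fallen short. Experiments show that this more precise uncertainty estimate improves sample efficiency.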
Abstract
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning. In particular, we focus on characterizing the variance over values induced by a distribut…