BriefGPT.xyz
Jan, 2024
Regularized Q-Learning with Linear Function Approximation
Jiachen Xi, Alfredo Garcia, Petar Momcilovic
TL;DR
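The paper proposes a single-loop algorithm that targets the projected Bellman error under linear function approximation, with finite-time convergence guarantees. The algorithm converges to a stationary point in the presence of Markovian noise, and the paper gives a performance guarantee for the policies derived from it.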
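For orientation, the objects named in the summary can be written down in one common form. This is only a sketch under assumed notation: the feature map \phi, the weighting matrix D, the temperature \tau, and the choice of entropy as the regularizer are assumptions made here and are not taken from the page, so the paper's own definitions may differ.

```latex
% Assumed notation (not from the source): features \phi(s,a) \in \mathbb{R}^d,
% discount \gamma \in (0,1), temperature \tau > 0, positive diagonal weighting matrix D.
% Linear function approximation of the state-action values:
Q_\theta(s,a) = \phi(s,a)^\top \theta, \qquad Q_\theta = \Phi\,\theta

% Entropy-regularized (soft) Bellman operator, one common choice of regularizer:
(\mathcal{T}_\tau Q)(s,a) = r(s,a) + \gamma\, \mathbb{E}_{s' \sim P(\cdot \mid s,a)}
  \Big[ \tau \log \sum_{a'} \exp\big( Q(s',a')/\tau \big) \Big]

% Projection onto the span of the features, and the projected Bellman error:
\Pi = \Phi\,(\Phi^\top D\, \Phi)^{-1} \Phi^\top D, \qquad
J(\theta) = \tfrac{1}{2}\, \big\| \Pi\, \mathcal{T}_\tau Q_\theta - Q_\theta \big\|_D^2
```

Under this assumed form, the policy associated with Q_\theta is the softmax \pi(a \mid s) \propto \exp(Q_\theta(s,a)/\tau), which can place mass on several actions at once.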
Abstract
Several successful reinforcement learning algorithms make use of regularization to promote multi-modal policies that exhibit enhanced expl…