BriefGPT.xyz
Mar, 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
Shaocong Ma, Ziyi Chen, Yi Zhou, Shaofeng Zou
TL;DR
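This paper presents Greedy-GQ, a value-based reinforcement learning algorithm, and its variance-reduced variant, VR-Greedy-GQ. By reducing variance, the variant converges faster and significantly reduces the error. The paper proves convergence with a lower sample complexity and reports experimental results.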
Abstract
Greedy-GQ is a value-based reinforcement learning (RL) algorithm for optimal control. Recently, the finite-time analysis of Greedy-GQ has been developed under linear function approximation and Markovian sampling,