BriefGPT.xyz
May, 2023
Sample Complexity of Variance-reduced Distributionally Robust Q-learning
Shengbo Wang, Nian Si, Jose Blanchet, Zhengyuan Zhou
TL;DR
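This paper proposes two model-free algorithms: distributionally robust Q-learning and a variance-reduced counterpart. Both efficiently learn robust policies under distributional shifts. A series of numerical experiments confirms the theoretical findings and the efficiency of the algorithms.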
Abstract
Dynamic decision making under distributional shifts is of fundamental interest in theory and applications of reinforcement learning: The d