BriefGPT.xyz
Mar, 2021
Infinite-Horizon Offline Reinforcement Learning with Linear Function Approximation: Curse of Dimensionality and Algorithm
Lin Chen, Bruno Scherrer, Peter L. Bartlett
TL;DR
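This paper studies the sample complexity of policy evaluation in infinite-horizon offline reinforcement learning with linear function approximation, together with algorithms under distribution-shift assumptions. It presents a lower bound on the sample complexity as well as an upper bound.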
Abstract
In this paper, we investigate the sample complexity of policy evaluation in infinite-horizon offline reinforcement learning (also known as the off-