BriefGPT.xyz
May, 2023
Uniform-PAC Guarantees for Model-Based RL with Bounded Eluder Dimension
Yue Wu, Jiafan He, Quanquan Gu
TL;DR
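This work proposes algorithms for nonlinear bandits and model-based episodic reinforcement learning that use a general function class with bounded eluder dimension. By assigning each action to a different level, the algorithms achieve a uniform probably approximately correct (Uniform-PAC) guarantee.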
Abstract
Recently, there has been remarkable progress in reinforcement learning (RL) with general function approximation. However, all these works only provide regret or sample complexity guarantees. It is still an open q