BriefGPT.xyz
Nov, 2023
A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation
Heyang Zhao, Jiafan He, Quanquan Gu
TL;DR
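We propose MQL-UCB, a new algorithm for reinforcement learning with general function approximation that addresses the exploration-exploitation dilemma. It keeps the cost of switching policies low, accounts for the complexity of the function class, and uses historical trajectories with high data efficiency, achieving near-optimal regret and a near-optimal switching cost.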
Abstract
The exploration-exploitation dilemma has been a central challenge in reinforcement learning (RL) with complex model classes. In this paper, we propose a new algorithm,