BriefGPT.xyz
Jun, 2020
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
Dongruo Zhou, Jiafan He, Quanquan Gu
TL;DR
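This paper introduces a new feature-mapping-based algorithm that handles large state and action spaces in reinforcement learning by parameterizing the transition kernel linearly. It proves that, for some reinforcement learning problems, the algorithm attains a polynomial regret bound without access to a generative model, and that this bound is nearly optimal overall.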
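To make "parameterizing the transition kernel linearly" concrete, here is a minimal sketch under stated assumptions. It assumes the common linear form P(s' | s, a) = ⟨φ(s' | s, a), θ⟩, where φ is a known feature mapping and θ is an unknown parameter vector. The tiny tabular setup, the helper names (`make_base_kernel`, `phi`, `linear_transition_prob`), and the choice to build φ by stacking a few base kernels are all hypothetical illustrations, not the paper's algorithm.

```python
import random


def linear_transition_prob(features, theta):
    """Return <features, theta>, the assumed linear form of P(s' | s, a)."""
    return sum(f * t for f, t in zip(features, theta))


def make_base_kernel(num_states, num_actions, rng):
    """Build one random, valid transition kernel (hypothetical helper)."""
    kernel = {}
    for s in range(num_states):
        for a in range(num_actions):
            weights = [rng.random() + 1e-3 for _ in range(num_states)]
            total = sum(weights)
            kernel[(s, a)] = [w / total for w in weights]
    return kernel


rng = random.Random(0)
num_states, num_actions = 4, 2

# Assumption: theta lies on the probability simplex, so a mixture of valid
# base kernels is itself a valid kernel.
theta = [0.5, 0.3, 0.2]
base_kernels = [make_base_kernel(num_states, num_actions, rng) for _ in theta]


def phi(s_next, s, a):
    """Feature vector phi(s' | s, a): one entry per base kernel."""
    return [kernel[(s, a)][s_next] for kernel in base_kernels]


# Full transition table implied by the linear parameterization.
P = {
    (s, a): [
        linear_transition_prob(phi(s_next, s, a), theta)
        for s_next in range(num_states)
    ]
    for s in range(num_states)
    for a in range(num_actions)
}
```

In this sketch the learner would only need to estimate the 3-dimensional vector `theta`, rather than every entry of the transition table, which is the reason a low-dimensional parameterization helps when state and action spaces are large.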
Abstract
Modern tasks in reinforcement learning have large state and action spaces. To deal with them efficiently, one often uses a predefined feature mapping to represent states and actions in a low-dimensional ...