BriefGPT.xyz
Jun, 2020
Model-Based Reinforcement Learning with Value-Targeted Regression
Alex Ayoub, Zeyu Jia, Csaba Szepesvari, Mengdi Wang, Lin F. Yang
TL;DR
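This paper studies regret minimization in model-based reinforcement learning. It proposes an algorithm based on the principle of optimism and on linear mixture models, and derives theoretical regret bounds.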
Abstract
This paper studies model-based reinforcement learning (RL) for regret minimization. We focus on finite-horizon episodic RL where the transition model $P$ belongs to a known family of models $\mathcal{P}$, a speci