Apr, 2018
Model-Free Linear Quadratic Control Based on Expert Prediction
Regret Bounds for Model-Free Linear Quadratic Control
Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvari
TL;DR
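This paper presents a new model-free algorithm for controlling linear quadratic systems. Using a reduction, it turns the control problem for a Markov decision process into a problem of prediction with expert advice. The algorithm is simple to implement and general, and it comes with several theoretical guarantees and good performance.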
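To make the "prediction with expert advice" setting concrete, here is a minimal sketch of the textbook exponentially weighted forecaster (Hedge). It is not the paper's algorithm: the function name `hedge`, the learning rate `eta`, and the two-expert toy losses are all assumptions chosen for illustration.

```python
import math

def hedge(loss_rounds, eta):
    """Generic exponentially weighted forecaster (Hedge); illustrative only.

    loss_rounds: list of rounds, each a list of per-expert losses in [0, 1].
    eta: learning rate (> 0), a hypothetical tuning parameter.
    Returns (forecaster's expected cumulative loss, per-expert cumulative losses).
    """
    n = len(loss_rounds[0])
    cum = [0.0] * n   # cumulative loss of each expert so far
    total = 0.0       # expected cumulative loss of the forecaster
    for losses in loss_rounds:
        # Weights proportional to exp(-eta * cumulative loss);
        # subtracting the minimum keeps the exponentials well scaled.
        m = min(cum)
        w = [math.exp(-eta * (c - m)) for c in cum]
        z = sum(w)
        p = [wi / z for wi in w]
        # The forecaster suffers the p-weighted average of the experts' losses.
        total += sum(pi * li for pi, li in zip(p, losses))
        cum = [c + l for c, l in zip(cum, losses)]
    return total, cum

# Toy data (assumed): expert 0 never errs, expert 1 always errs.
rounds = [[0.0, 1.0]] * 50
total, cum = hedge(rounds, eta=0.5)
regret = total - min(cum)  # excess loss over the best single expert
```

Regret here is measured against the best single expert in hindsight; for losses in [0, 1] the standard Hedge analysis bounds it by ln(N)/eta + eta*T/8 for N experts over T rounds.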
Abstract
Model-free approaches for reinforcement learning (RL) and continuous control find policies based only on past states and rewards, without fitting a model of the system dynamics. They are appealing as they are gen