Efficiently Learning Linear-Quadratic Regulators with Only $\sqrt{T}$ Regret

Feb, 2019

Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret

Alon Cohen, Tomer Koren, Yishay Mansour

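TL;DR: We present the first computationally efficient algorithm that incurs only $\widetilde O(\sqrt{T})$ regret when learning in linear quadratic control systems with unknown dynamics.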

Abstract

We present the first computationally-efficient algorithm with $\widetilde O(\sqrt{T})$ regret for learning in linear quadratic control systems