BriefGPT.xyz
Feb, 2019
Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret
Alon Cohen, Tomer Koren, Yishay Mansour
TL;DR
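We present the first computationally efficient algorithm that incurs only $\widetilde O(\sqrt{T})$ regret when learning in linear quadratic control systems with unknown dynamics.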
Abstract
We present the first computationally-efficient algorithm with $\widetilde O(\sqrt{T})$ regret for learning in linear quadratic control systems …
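For readers unfamiliar with the setting, the regret notion can be sketched as follows. The notation is an assumption based on the standard linear-quadratic formulation, not something spelled out in this summary: the state $x_t$ and control $u_t$ evolve under unknown matrices $A_\star, B_\star$ with noise $w_t$, the cost matrices $Q, R$ are positive definite, and $J_\star$ denotes the optimal steady-state average cost achievable when the dynamics are known.

$$
x_{t+1} = A_\star x_t + B_\star u_t + w_t,
\qquad
c_t = x_t^\top Q\, x_t + u_t^\top R\, u_t,
$$

$$
\mathrm{Regret}_T = \sum_{t=1}^{T} c_t \;-\; T\, J_\star .
$$

Under this reading, a bound of $\widetilde O(\sqrt{T})$ means the learner's average per-step excess cost over the optimal controller shrinks at a rate of roughly $1/\sqrt{T}$, up to logarithmic factors.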