Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently

Feb, 2020

Asaf Cassel, Alon Cohen, Tomer Koren

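TL;DR: This paper studies the problem of learning in linear quadratic control systems and presents efficient algorithms whose regret grows only logarithmically with the number of decision steps when certain conditions hold. When those conditions do not hold, regret that grows as the square root of the number of steps is unavoidable.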

Abstract

We consider the problem of learning in linear quadratic control systems whose transition parameters are initially unknown. Recent results in this setting have demonstrated efficient learning algorithms with