Logarithmic Regret for Adversarial Online Control

Feb, 2020

Dylan J. Foster, Max Simchowitz

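TL;DR: This paper introduces a new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances. It reduces the online control problem to (delayed) online learning with approximate advantage functions, which removes the need to control movement costs for the iterates and thereby improves the algorithm's performance.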

Abstract

We introduce a new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances. Existing regret bounds for this setting scale as $\sqrt{T}$ unless strong stochastic assumpt