BriefGPT.xyz
Feb, 2020
Logarithmic Regret for Adversarial Online Control
Dylan J. Foster, Max Simchowitz
TL;DR
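This paper introduces a new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances. It reduces the online control problem to (delayed) online learning with approximate advantage functions, so the analysis does not need to control the movement costs of the iterates, which improves the algorithm's performance.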
Abstract
We introduce a new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances. Existing regret bounds for this setting scale as $\sqrt{T}$ unless strong stochastic assumpt