Information-Theoretic Regret Bounds for Online Nonlinear Control

June 2020

Information Theoretic Regret Bounds for Online Nonlinear Control

Sham Kakade, Akshay Krishnamurthy, Kendall Lowrey, Motoya Ohnishi, Wen Sun

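TL;DR: This work addresses control of an unknown nonlinear dynamical system. It proposes a sequential control algorithm based on a reproducing kernel Hilbert space and uses information-theoretic quantities to obtain a near-optimal regret upper bound. Experiments show good performance on several nonlinear control tasks.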

Abstract

This work studies the problem of sequential control in an unknown, nonlinear dynamical system, where we model the underlying system dynamics as an unknown function in a known reproducing kernel Hilbert space. Thi