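TL;DR: This paper proposes an algorithm that learns efficiently by exploiting the system's intrinsic low-rank structure, so that the sample complexity depends only on the rank rather than on the ambient dimension. It also achieves complexity that is sublinear in K, and it performs well when applied to the LQR problem.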
Abstract
Linear quadratic regulators (LQR) have seen enormous success in real-world
applications. Very recently, research has focused on efficient learning
algorithms for LQRs whose dynamics are unknown. Existing results
effectively learn to control the unknown system using a number of ep