BriefGPT.xyz
May, 2023
Policy Learning based on Deep Koopman Representation
Wenjian Hao, Paulo C. Heredia, Bowen Huang, Zehui Lu, Zihao Liang...
TL;DR
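This paper proposes a policy learning algorithm based on Koopman operator theory and the policy gradient method. The algorithm combines linear approximation of an unknown dynamical system with the search for an optimal policy, introduces the so-called deep Koopman representation to improve data efficiency, and applies Bellman's principle of optimality to avoid the accumulated error that approximating the system dynamics causes in long-horizon tasks. A theoretical analysis is provided to establish the asymptotic convergence and sampling complexity of the proposed algorithm.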
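To make the idea of a linear approximation in a lifted space concrete, here is a minimal sketch in Python with NumPy. It is not the paper's algorithm: the paper learns the lifting with a deep network and couples it with policy gradient, whereas this sketch assumes a fixed, hand-picked lifting and fits the linear model by ordinary least squares. The toy system `step`, the observables in `lift`, and the helper `fit_koopman` are all hypothetical names chosen for illustration.

```python
import numpy as np

def lift(x):
    """Hypothetical hand-picked observables g(x) = [x1, x2, x1^2].
    (Assumption: the paper learns this map with a deep network instead.)"""
    x = np.atleast_2d(x)
    return np.column_stack([x[:, 0], x[:, 1], x[:, 0] ** 2])

def step(x, u, a=0.9, b=0.5, c=0.3):
    """Toy nonlinear system, chosen so that it is exactly linear in lift(x):
    x1' = a*x1,  x2' = b*x2 + c*x1^2 + u."""
    x = np.atleast_2d(x)
    u = np.asarray(u, dtype=float).reshape(-1)
    return np.column_stack([a * x[:, 0], b * x[:, 1] + c * x[:, 0] ** 2 + u])

def fit_koopman(X, U, X_next):
    """Least-squares fit of z' ~= K z + B u, where z = lift(x)."""
    Z, Z_next = lift(X), lift(X_next)
    features = np.column_stack([Z, U])            # shape (N, 3 + 1)
    W, *_ = np.linalg.lstsq(features, Z_next, rcond=None)
    n_z = Z.shape[1]
    K = W[:n_z].T                                 # lifted state matrix, (3, 3)
    B = W[n_z:].T                                 # lifted input matrix, (3, 1)
    return K, B

# Collect one-step transitions from random states and inputs.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
U = rng.uniform(-1.0, 1.0, size=200)
X_next = step(X, U)

K, B = fit_koopman(X, U, X_next)
print(np.round(K, 3))
print(np.round(B, 3))
```

Because the toy system was built to be closed under the chosen observables, the fitted `K` and `B` reproduce the lifted dynamics up to numerical precision. For a general unknown system no such fixed lifting is known in advance, which is the motivation for learning the representation from data.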
Abstract
This paper proposes a policy learning algorithm based on the Koopman operator theory and policy gradient approach, which seeks to approxim...