Jan, 2021
A Provably Efficient Algorithm for Linear Markov Decision Process with Low Switching Cost
Minbo Gao, Tianle Xie, Simon S. Du, Lin F. Yang
TL;DR
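This paper studies low switching cost in linear Markov decision processes (MDPs) and proposes the first linear MDP algorithm with low switching cost, keeping the number of policy switches small while still generalizing to large state spaces.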
Abstract
Many real-world applications, such as those in medical domains, recommendation systems, etc., can be formulated as large state space reinforcement learning problems with only a small budget of the number of policy changes, i.e., low