BriefGPT.xyz
May, 2021
Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation
Daniel Vial, Advait Parulekar, Sanjay Shakkottai, R. Srikant
TL;DR
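This work proposes an algorithm for stochastic shortest path problems with linear function approximation that achieves sublinear regret, is computationally efficient, and uses stationary policies; it is the first algorithm of this kind in the area.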
Abstract
We propose two algorithms for episodic stochastic shortest path problems with linear function approximation. The first is computationally expensive but provably obtains $\tilde{O}(\sqrt{B_\star^3 d^3 K/c_{\min}})$ …