BriefGPT.xyz
Aug, 2020
Efficiently Solving MDPs with Stochastic Mirror Descent
Yujia Jin, Aaron Sidford
TL;DR
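Using a unified framework based on primal-dual stochastic mirror descent, the paper gives a method for approximately solving infinite-horizon Markov decision processes with a generative model, and also proposes methods for solving bilinear saddle-point problems and constrained MDPs.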
Abstract
We present a unified framework based on primal-dual stochastic mirror descent for approximately solving infinite-horizon Markov decision processes (MDPs) given a