BriefGPT.xyz
Oct, 2020
A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin
TL;DR
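This paper proposes Nash-VI, a model-based algorithm for multi-agent Markov games. It proves theoretically that the algorithm is sample-efficient and reports that it outperforms existing model-based methods and some model-free algorithms. Nash-VI outputs a single Markov policy that is easy to store and execute.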
Abstract
Model-based algorithms (algorithms that decouple learning of the model and planning given the model) are widely used in reinforcement learning practice and theoretically shown to achieve optimal sample efficiency