BriefGPT.xyz
Aug, 2023
Asynchronous Decentralized Q-Learning: Two Timescale Analysis By Persistence
Bora Yongacoglu, Gürdal Arslan, Serdar Yüksel
TL;DR
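This paper examines the challenge of non-stationarity in multi-agent reinforcement learning. It introduces an asynchronous variant of the decentralized Q-learning algorithm and gives sufficient conditions under which the asynchronous algorithm drives play to equilibrium with high probability. It also extends the applicability of the algorithm and related methods to settings where parameters are selected independently, taming non-stationarity without imposing coordination assumptions.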
Abstract
Non-stationarity is a fundamental challenge in multi-agent reinforcement learning (MARL), where agents update their behaviour as they learn. Many theoretical advances in MARL avoid the challenge of …