BriefGPT.xyz
June 2016
Distributed Cooperative Decision-Making in Multiarmed Bandits: Frequentist and Bayesian Algorithms
Peter Landgren, Vaibhav Srivastava, Naomi Ehrich Leonard
TL;DR
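This work combines frequentist and Bayesian algorithms with a running consensus algorithm to address distributed cooperative decision-making under the explore-exploit tradeoff in the multi-agent multiarmed bandit problem. It establishes the performance of these algorithms and shows how the structure of the communication graph affects decision-making performance.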
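To make the idea of pairing a bandit rule with running consensus concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: the function names, the Gaussian unit-variance rewards, the row-stochastic weight matrix `weights`, and the particular form of the exploration bonus are all hypothetical choices made for this example. Each agent keeps consensus estimates of per-arm pull counts and reward sums, mixes them with its neighbours' estimates at every step, and picks the arm with the largest UCB-style index.

```python
import math
import random


def consensus_step(values, weights):
    """One averaging step: agent k takes a weighted mean of all agents' values.

    `weights` is assumed row-stochastic, with weights[k][j] == 0 when
    agents k and j are not neighbours in the communication graph.
    """
    n = len(values)
    return [sum(weights[k][j] * values[j] for j in range(n)) for k in range(n)]


def ucb_index(reward_sum, count, t, n_agents):
    """UCB-style index built from consensus estimates (assumed form)."""
    count = max(count, 1e-9)  # guard against division by zero
    mean = reward_sum / count
    # count estimates the average pulls per agent, so n_agents * count
    # approximates the group's total pulls of this arm.
    return mean + math.sqrt(2.0 * math.log(t) / (n_agents * count))


def cooperative_ucb(means, weights, horizon, seed=0):
    """Simulate the sketch; returns the actual pull counts per agent and arm."""
    rng = random.Random(seed)
    n_agents, n_arms = len(weights), len(means)
    counts = [[0.0] * n_arms for _ in range(n_agents)]  # consensus pull counts
    sums = [[0.0] * n_arms for _ in range(n_agents)]    # consensus reward sums
    pulls = [[0] * n_arms for _ in range(n_agents)]     # true local pulls

    for t in range(1, horizon + 1):
        choices, rewards = [], []
        for k in range(n_agents):
            if t <= n_arms:
                arm = t - 1  # sample every arm once before using the index
            else:
                arm = max(
                    range(n_arms),
                    key=lambda i: ucb_index(sums[k][i], counts[k][i], t, n_agents),
                )
            choices.append(arm)
            rewards.append(rng.gauss(means[arm], 1.0))  # assumed reward model
            pulls[k][arm] += 1

        # Running consensus: add the new local observation, then mix.
        for i in range(n_arms):
            new_counts = [counts[k][i] + (1.0 if choices[k] == i else 0.0)
                          for k in range(n_agents)]
            new_sums = [sums[k][i] + (rewards[k] if choices[k] == i else 0.0)
                        for k in range(n_agents)]
            mixed_counts = consensus_step(new_counts, weights)
            mixed_sums = consensus_step(new_sums, weights)
            for k in range(n_agents):
                counts[k][i] = mixed_counts[k]
                sums[k][i] = mixed_sums[k]
    return pulls


if __name__ == "__main__":
    # Three agents on a line graph; symmetric weights with rows summing to 1.
    line_graph = [[2 / 3, 1 / 3, 0.0],
                  [1 / 3, 1 / 3, 1 / 3],
                  [0.0, 1 / 3, 2 / 3]]
    print(cooperative_ucb([0.2, 0.5, 0.9], line_graph, horizon=500))
```

Changing the zero pattern of `weights` changes which agents exchange estimates, which is one simple way to experiment with how the communication graph influences each agent's choices in this toy setting.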
Abstract
We study distributed cooperative decision-making under the explore-exploit tradeoff in the multiarmed bandit (MAB) problem. We extend the state-of-the-art frequentist and Bayesian algorithms for single-agent MAB ...