Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits

April 2019

Chao Tao, Qin Zhang, Yuan Zhou

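TL;DR: This paper studies collaborative multi-agent learning in multi-armed bandits. It asks how efficient collaborative learning can be, compared with a centralized algorithm, when interaction is limited because communication is expensive. It introduces several new techniques and gives a deeper analysis of lower bounds on the number of communication steps under time or confidence constraints.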

Abstract

Best arm identification (or, pure exploration) in multi-armed bandits is a fundamental problem in machine learning. In this paper we study the distributed version of this problem where we have multiple agents, and they want to learn the best arm collaboratively. We want to quantify the