The demand for collaborative and private bandit learning across multiple agents is surging due to the growing quantity of data generated by distributed systems. Federated bandit learning has emerged as a promising framework for private, efficient, and decentralized online learning. However, almost all previous work relies on a strong assumption of client homogeneity, i.e., that all participating clients share the same bandit model; otherwise, all of them suffer linear regret. This greatly restricts the application of federated bandit learning in practice. In this work, we introduce a new approach to federated bandits with heterogeneous clients, which clusters clients for collaborative bandit learning under the federated learning setting. Our proposed algorithm achieves non-trivial sub-linear regret and communication cost for all clients, subject to the federated learning communication protocol under which the server can share only one model at any time.

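Summary: The paper proposes a federated bandit learning algorithm for heterogeneous clients that clusters clients to enable collaborative bandit learning. Under the federated learning setting, the algorithm achieves non-trivial sub-linear regret and communication cost for all clients, provided that the server shares only one model at any time.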

Federated Linear Contextual Bandits with Heterogeneous Clients