BriefGPT.xyz
Oct, 2021
Asynchronous Upper Confidence Bound Algorithms for Federated Linear Bandits
Chuanhao Li, Hongning Wang
TL;DR
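This paper studies linear contextual bandits in a federated learning setting. It proposes a general framework based on asynchronous model updates and communication, gives a theoretical analysis of regret and communication cost under distributed learning, and reports extensive empirical evaluations that demonstrate the effectiveness of the solution.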
Abstract
The linear contextual bandit is a popular online learning problem. It has mostly been studied in centralized learning settings. With the surging demand for large-scale decentralized model learning, e.g., federated learning …