Thompson Sampling with Clustered Arms

September 2021

Thompson Sampling for Bandits with Clustered Arms

Emil Carlsson, Devdatt Dubhashi, Fredrik D. Johansson

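TL;DR: This paper proposes algorithms based on a multi-level Thompson sampling scheme for the multi-armed bandit, and for its contextual variant with linear expected rewards, when the arms are grouped into clusters. It shows, both theoretically and empirically, that exploiting a given cluster structure can significantly improve regret and reduce computational cost.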
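To make the idea of a multi-level scheme concrete, here is a minimal sketch of one possible two-level Thompson sampler for Bernoulli rewards. It is an illustration under stated assumptions, not the paper's exact algorithm: the function name `two_level_thompson`, the Beta(1, 1) priors, the Bernoulli reward model, and the rule of updating both the cluster posterior and the arm posterior after each pull are all choices made here for the example.

```python
import random


def two_level_thompson(clusters, true_means, horizon, seed=0):
    """Hypothetical two-level Thompson sampler for Bernoulli arms.

    clusters   : list of lists of arm indices (a partition of the arms)
    true_means : success probability of each arm (simulation only)
    horizon    : number of rounds to play
    Returns (pulls per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    n_clusters = len(clusters)

    # Success/failure counts for the Beta posteriors (assumed Beta(1, 1) priors).
    arm_succ, arm_fail = [0] * n_arms, [0] * n_arms
    cl_succ, cl_fail = [0] * n_clusters, [0] * n_clusters

    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(horizon):
        # Level 1: draw one posterior sample per cluster and keep the best.
        c = max(
            range(n_clusters),
            key=lambda i: rng.betavariate(cl_succ[i] + 1, cl_fail[i] + 1),
        )
        # Level 2: draw one posterior sample per arm inside that cluster only.
        a = max(
            clusters[c],
            key=lambda j: rng.betavariate(arm_succ[j] + 1, arm_fail[j] + 1),
        )

        # Simulated Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_means[a] else 0

        # Update both levels with the observed reward.
        arm_succ[a] += reward
        arm_fail[a] += 1 - reward
        cl_succ[c] += reward
        cl_fail[c] += 1 - reward

        pulls[a] += 1
        total_reward += reward

    return pulls, total_reward
```

In each round this sketch draws one sample per cluster plus one sample per arm in the selected cluster, rather than one sample per arm overall. That is one way a cluster structure can lower the per-round sampling cost when there are many arms.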

Abstract

We propose algorithms based on a multi-level Thompson sampling scheme, for the stochastic multi-armed bandit and its contextual variant wi