BriefGPT.xyz
Sep, 2021
Thompson Sampling for Bandits with Clustered Arms
Emil Carlsson, Devdatt Dubhashi, Fredrik D. Johansson
TL;DR
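The paper proposes algorithms based on a multi-level Thompson sampling scheme for multi-armed bandits, including the contextual variant with linear expected rewards, in settings where the arms are grouped into clusters. It shows, both theoretically and empirically, that exploiting a given cluster structure can significantly improve regret and reduce computational cost.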
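To make the multi-level idea concrete, here is a minimal sketch of a two-level Thompson sampling loop for Bernoulli rewards. It is an illustration under stated assumptions, not the paper's algorithm: the function name `two_level_thompson`, the `pull` callback, the Beta(1, 1) priors, and the choice to update the cluster posterior with the same reward as the arm posterior are all assumptions made for this example. The contextual (linear) variant is not covered.

```python
import random


def two_level_thompson(clusters, pull, horizon, seed=0):
    """Hypothetical two-level Thompson sampling for Bernoulli rewards.

    clusters: list of lists of arm ids (each arm belongs to one cluster).
    pull:     callable (arm, rng) -> reward in {0, 1}; assumed interface.
    Returns a dict mapping each arm to the number of times it was played.
    """
    rng = random.Random(seed)
    # Beta(1, 1) priors stored as [alpha, beta]; an assumption of this sketch.
    cluster_post = [[1, 1] for _ in clusters]
    arm_post = {arm: [1, 1] for cluster in clusters for arm in cluster}
    counts = {arm: 0 for arm in arm_post}

    for _ in range(horizon):
        # Level 1: draw one posterior sample per cluster, keep the largest.
        c = max(range(len(clusters)),
                key=lambda i: rng.betavariate(*cluster_post[i]))
        # Level 2: draw samples only for the arms inside the chosen cluster.
        arm = max(clusters[c], key=lambda a: rng.betavariate(*arm_post[a]))

        reward = pull(arm, rng)
        # Update both the cluster-level and the arm-level posterior.
        for post in (cluster_post[c], arm_post[arm]):
            post[0] += reward
            post[1] += 1 - reward
        counts[arm] += 1
    return counts


# Toy environment (made-up means): cluster [2, 3] holds the better arms.
means = {0: 0.1, 1: 0.2, 2: 0.8, 3: 0.7}
counts = two_level_thompson(
    [[0, 1], [2, 3]],
    lambda arm, rng: int(rng.random() < means[arm]),
    horizon=2000,
)
```

Each round draws one sample per cluster plus one per arm in the selected cluster, rather than one per arm overall, which is where a saving in computation would come from when there are many arms.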
Abstract
We propose algorithms based on a multi-level Thompson sampling scheme, for the stochastic multi-armed bandit and its contextual variant wi