BriefGPT.xyz
Oct, 2011
The multi-armed bandit problem with covariates
Vianney Perchet, Philippe Rigollet
TL;DR
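This work proposes a new policy, ABSE, for the dynamic multi-armed bandit problem. ABSE adaptively decomposes the global problem into static multi-armed bandit problems. The paper also gives sharper regret bounds for the Successive Elimination policy in the static problem, and shows that ABSE attains the minimax regret bound in the dynamic problem.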
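To make the elimination idea in the summary concrete, here is a minimal sketch of a generic, textbook-style successive-elimination rule on a static bandit. It is not the paper's SE or ABSE policy: the Bernoulli reward model, the Hoeffding-style confidence radius, and every name (`successive_elimination`, `means`, `horizon`, `seed`) are assumptions chosen for illustration only.

```python
import math
import random


def successive_elimination(means, horizon, seed=0):
    """Generic successive elimination on a static Bernoulli bandit.

    means   -- hypothetical true success probabilities, one per arm
    horizon -- maximum total number of pulls
    Returns (list of surviving arm indices, pull counts per arm).
    """
    rng = random.Random(seed)
    n_arms = len(means)
    active = list(range(n_arms))
    sums = [0.0] * n_arms
    pulls = [0] * n_arms
    t = 0

    # Run full rounds only, so every surviving arm has the same pull count.
    while len(active) > 1 and t + len(active) <= horizon:
        for arm in active:
            reward = 1.0 if rng.random() < means[arm] else 0.0
            sums[arm] += reward
            pulls[arm] += 1
            t += 1

        n = pulls[active[0]]
        # Assumed confidence radius (Hoeffding-style), not the paper's choice.
        radius = math.sqrt(2.0 * math.log(horizon) / n)
        estimates = {arm: sums[arm] / pulls[arm] for arm in active}
        best = max(estimates.values())

        # Keep an arm only while its upper bound reaches the leader's lower bound.
        active = [arm for arm in active if estimates[arm] + radius >= best - radius]

    return active, pulls


if __name__ == "__main__":
    survivors, pulls = successive_elimination([0.2, 0.5, 0.9], horizon=5000)
    print("surviving arms:", survivors)
    print("pulls per arm:", pulls)
```

Clearly worse arms stop being pulled once their confidence interval falls below that of the empirical leader, so most of the budget is left for the arms that remain plausible.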
Abstract
We consider a multi-armed bandit problem in a setting where each arm produces a noisy reward realization which depends on an observable random co
→