Apr, 2020
Improved Sleeping Bandits with Stochastic Actions Sets and Adversarial Rewards
Aadirupa Saha, Pierre Gaillard, Michael Valko
TL;DR
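This paper considers the sleeping bandit problem with stochastic action sets and adversarial rewards. It proposes a new, efficient EXP3-inspired algorithm, and gives an efficient algorithm with guarantees for the most general version of the problem, in which the set of available actions in each round is drawn from some unknown, arbitrary distribution.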
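To make the setting concrete, here is a minimal sketch of one round of an EXP3-style update when only a subset of arms is available ("awake"). It is a simplified illustration and not the paper's algorithm: it renormalizes the weights over the available arms and importance-weights the observed loss by the conditional sampling probability. The names `sleeping_exp3_step`, `loss_fn`, and `eta` are hypothetical, and losses are assumed to lie in [0, 1].

```python
import math
import random


def sleeping_exp3_step(weights, available, loss_fn, eta, rng):
    """One round of a simplified EXP3-style update restricted to awake arms.

    weights   -- list of positive floats, one per arm (updated in place)
    available -- non-empty list of arm indices awake this round
    loss_fn   -- hypothetical callback: arm index -> loss in [0, 1]
    eta       -- learning rate (assumed > 0)
    rng       -- random.Random instance used for sampling
    """
    arms = sorted(available)
    total = sum(weights[i] for i in arms)
    # Play distribution: exponential weights renormalized over awake arms.
    probs = [weights[i] / total for i in arms]
    arm = rng.choices(arms, weights=probs, k=1)[0]

    loss = loss_fn(arm)
    # Importance-weighted loss estimate for the arm that was played.
    estimate = loss / probs[arms.index(arm)]
    # Only the played arm's weight changes; sleeping arms are untouched.
    weights[arm] *= math.exp(-eta * estimate)
    return arm, loss
```

A caller would draw a fresh `available` set each round (for example, from some distribution unknown to the learner) and invoke the step repeatedly; for long runs, storing log-weights instead of raw weights avoids numerical underflow.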
Abstract
In this paper, we consider the problem of sleeping bandits with stochastic action sets and adversarial rewards. In this setting, in contra