BriefGPT.xyz
Aug, 2020
Near Optimal Adversarial Attack on UCB Bandits
Shiliang Zuo
TL;DR
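We propose a novel attack strategy for the stochastic multi-armed bandit problem that manipulates the UCB principle into selecting some suboptimal target arm. The cumulative attack cost grows with the number of rounds T, and our upper bound differs from the lower bound by only a log log T factor, so the attack is near optimal.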
Abstract
We consider a stochastic multi-arm bandit problem where rewards are subject to adversarial corruption. We propose a novel attack strategy…
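To make the setting concrete, here is a minimal sketch of a reward-corruption attack on a UCB learner. It is not the strategy proposed in the paper: it assumes a naive, hypothetical attacker that overwrites every non-target reward with 0, a standard UCB1 index, and Bernoulli arms. The function name `ucb1_under_attack` and all parameter values are illustrative assumptions.

```python
import math
import random


def ucb1_under_attack(true_means, target, horizon, seed=0):
    """Run UCB1 on Bernoulli arms while a naive attacker corrupts rewards.

    Hypothetical baseline, NOT the paper's strategy: every reward drawn from
    a non-target arm is overwritten with 0 before the learner observes it.
    Returns (pull counts per arm, cumulative attack cost).
    """
    rng = random.Random(seed)
    k = len(true_means)
    pulls = [0] * k
    observed_sum = [0.0] * k
    attack_cost = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull each arm once to initialise the estimates
        else:
            # UCB1 index: empirical mean of observed rewards plus a bonus
            arm = max(
                range(k),
                key=lambda i: observed_sum[i] / pulls[i]
                + math.sqrt(2.0 * math.log(t) / pulls[i]),
            )

        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        # Assumed attack rule: leave the target arm alone, zero out the rest.
        corrupted = reward if arm == target else 0.0
        attack_cost += abs(reward - corrupted)

        pulls[arm] += 1
        observed_sum[arm] += corrupted

    return pulls, attack_cost


pulls, cost = ucb1_under_attack([0.9, 0.8, 0.3], target=2, horizon=2000)
print(pulls, cost)
```

In this sketch the learner ends up pulling the target arm (index 2, the worst arm by true mean) in most rounds, and the attacker pays only when a non-target arm is pulled. The cost here is measured as the total absolute change to the rewards, which is one common convention and an assumption on our part; the paper's attack is designed to achieve a much smaller cumulative cost than this naive baseline.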