BriefGPT.xyz
Jan, 2018
More Adaptive Algorithms for Adversarial Bandits
Chen-Yu Wei, Haipeng Luo
TL;DR
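The paper proposes a novel algorithm that combines optimism and adaptivity techniques with the online mirror descent framework and a special log-barrier regularizer to address the adversarial multi-armed bandit problem and the combinatorial semi-bandit problem. It obtains a variety of new data-dependent regret bounds while improving on previous work.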
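To make the ingredients named above more concrete, here is a minimal sketch of plain online mirror descent with a log-barrier regularizer for the multi-armed bandit case. It is not the paper's algorithm: it omits the optimism and adaptive learning-rate components, uses a fixed learning rate `eta`, and assumes the standard importance-weighted loss estimator. The function names `log_barrier_omd_step` and `run_bandit` are hypothetical and chosen only for illustration.

```python
import random


def log_barrier_omd_step(w, loss_est, eta, iters=60):
    """One OMD step over the simplex with regularizer (1/eta) * sum_i ln(1/w_i).

    The update has the form 1/w_new[i] = 1/w[i] + eta * (loss_est[i] - lam),
    where the scalar lam is chosen so that w_new sums to 1. We find lam by
    bisection (an implementation choice for this sketch).
    """
    # At lam = min(loss_est) every weight shrinks or stays, so the sum is <= 1.
    lo = min(loss_est)
    # At this value some denominator hits zero, so the sum diverges.
    hi = min(l + 1.0 / (eta * p) for l, p in zip(loss_est, w))
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        denoms = [1.0 / p + eta * (l - mid) for l, p in zip(loss_est, w)]
        if min(denoms) <= 0.0 or sum(1.0 / d for d in denoms) > 1.0:
            hi = mid
        else:
            lo = mid
    new_w = [1.0 / (1.0 / p + eta * (l - lo)) for l, p in zip(loss_est, w)]
    total = sum(new_w)
    # Renormalize to absorb the small residual error of the bisection.
    return [x / total for x in new_w]


def run_bandit(losses, eta=0.1, seed=0):
    """Play a fixed loss sequence (each loss in [0, 1]) with bandit feedback."""
    rng = random.Random(seed)
    k = len(losses[0])
    w = [1.0 / k] * k
    total_loss = 0.0
    for loss_t in losses:
        arm = rng.choices(range(k), weights=w)[0]
        total_loss += loss_t[arm]
        # Importance-weighted estimator: only the pulled arm's loss is observed.
        est = [0.0] * k
        est[arm] = loss_t[arm] / w[arm]
        w = log_barrier_omd_step(w, est, eta)
    return w, total_loss


if __name__ == "__main__":
    # Toy sequence (an assumption for illustration): arm 0 never incurs loss.
    final_w, incurred = run_bandit([[0.0, 1.0]] * 200)
    print(final_w, incurred)
```

On the toy sequence, the sampling distribution shifts toward the arm with zero loss. The log barrier keeps every weight strictly positive, which is what keeps the importance-weighted estimates finite.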
Abstract
We develop a novel and generic algorithm for the adversarial multi-armed bandit problem (or more generally the combinatorial semi-bandit problem). When instantiated differently, our algorithm achieves various new data-dependent