Exploiting Structure of Uncertainty for Efficient Matroid Semi-Bandits

Feb, 2019

Exploiting Structure of Uncertainty for Efficient Combinatorial Semi-Bandits

Pierre Perrault, Vianney Perchet, Michal Valko

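TL;DR: By reducing the implementation to a specific submodular maximization and designing adapted approximation routines, this work provides the first efficient algorithms that rely on reward structure to improve the regret bound. The improvement tightens the state-of-the-art gap-free regret bound by a factor of sqrt(m)/log m. Finally, the authors show how the improvement translates to more general budgeted combinatorial semi-bandits.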

Abstract

We improve the efficiency of algorithms for stochastic \emph{combinatorial semi-bandits}. In most interesting problems, state-of-the-art algorithms take advantage of structural properties of rewards, such as \emph{independence}. However, while being minimax optimal in terms of regret,