BriefGPT.xyz
May, 2016
An optimal algorithm for the Thresholding Bandit Problem
Andrea Locatelli, Maurilio Gutzeit, Alexandra Carpentier
TL;DR
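This paper proposes a parameter-free algorithm, based on a heuristic, for a specific combinatorial pure-exploration stochastic bandit problem: finding the set of arms whose means lie above a given threshold, up to a given precision and within a fixed time budget. The authors prove that the algorithm is optimal for this problem and provide the corresponding upper and lower bounds. The paper is presented as the first to construct an optimal strategy for a fixed-budget problem in the pure-exploration setting.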
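To make the problem statement concrete, here is a minimal sketch of the thresholding task itself. It is not the paper's algorithm: it uses a plain round-robin (uniform allocation) baseline, and the names `threshold_arms_uniform` and `is_acceptable`, the Gaussian unit-variance rewards, and the reading of "up to a given precision" (arms whose means lie within `eps` of the threshold may be classified either way) are all assumptions made for illustration.

```python
import random


def threshold_arms_uniform(means, tau, budget, seed=0):
    """Uniform-allocation baseline (an assumption, NOT the paper's algorithm).

    Pulls arms in round-robin order for `budget` rounds and returns the set
    of arm indices whose empirical mean is at least the threshold `tau`.
    Rewards are simulated as Gaussian with unit variance (also an assumption).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(budget):
        i = t % k  # round-robin: every arm gets an equal share of the budget
        sums[i] += rng.gauss(means[i], 1.0)
        counts[i] += 1
    return {i for i in range(k) if counts[i] > 0 and sums[i] / counts[i] >= tau}


def is_acceptable(answer, means, tau, eps):
    """Check an answer against the true means, up to precision `eps`.

    Assumed reading: an arm with mean >= tau + eps must be included, an arm
    with mean <= tau - eps must be excluded, and arms in between are free.
    """
    for i, mu in enumerate(means):
        if mu >= tau + eps and i not in answer:
            return False
        if mu <= tau - eps and i in answer:
            return False
    return True


if __name__ == "__main__":
    true_means = [0.1, 0.9]
    answer = threshold_arms_uniform(true_means, tau=0.5, budget=2000)
    print(answer, is_acceptable(answer, true_means, tau=0.5, eps=0.1))
```

A uniform baseline spends the same number of pulls on every arm, including arms that are far from the threshold and easy to classify; an adaptive strategy would instead concentrate its budget on the arms that are hardest to separate from the threshold.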
Abstract
We study a specific \textit{combinatorial pure exploration stochastic bandit problem} where the learner aims at finding the set of arms whose means are above a given threshold, up to a given precision, and …