Jun, 2019

The True Sample Complexity of Identifying Good Arms

Julian Katz-Samuels, Kevin Jamieson

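TL;DR: The paper studies two multi-armed bandit problems: how to identify an arm whose mean is within a given threshold of the largest mean, and how to identify k arms whose means exceed a given threshold. It gives a formal definition of sample complexity for these settings, derives lower bounds, and provides concrete, practical algorithms whose upper bounds nearly match them.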

Abstract

We consider two multi-armed bandit problems with $n$ arms: (i) given an $\epsilon > 0$, identify an arm with mean that is within $\epsilon$ of the largest mean and (ii) given a threshold $\mu_0$ and integer $k$, identify $k$ arms with means larger than $\mu_0$. Existing lower bounds an