In this paper we propose a pool-based active learning algorithm for binary classification, inspired by multi-armed bandits. By carefully constructing an analogy between active learning and multi-armed bandits, we draw on ideas from the bandit literature, such as lower confidence bounds and self-concordant regularization, to design the algorithm. The algorithm is sequential: in each round it assigns a sampling distribution on the pool, samples one point from this distribution, and queries the oracle for the label of the sampled point. The design of this sampling distribution is also inspired by the analogy between active learning and multi-armed bandits. We show how to derive the lower confidence bounds required by our algorithm. Experimental comparisons with previously proposed active learning algorithms show superior performance on some standard UCI datasets.
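
The round structure described in the abstract (assign a sampling distribution on the pool, sample one point, query the oracle) can be illustrated with a minimal Python sketch. This is not the proposed algorithm: the abstract does not specify the sampling distribution, so the one below (small-margin scores mixed with a uniform distribution) is a placeholder assumption, and the lower confidence bounds and self-concordant regularization are not reproduced. The names `fit_weighted_logreg`, `active_learning_loop`, `oracle`, and `mix` are hypothetical, and labels are assumed to lie in {-1, +1}.

```python
import numpy as np


def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def fit_weighted_logreg(X, y, w, lam=1e-2, steps=200, lr=0.5):
    """Importance-weighted, L2-regularized logistic regression fitted by
    plain gradient descent. Labels y are assumed to be in {-1, +1}."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = y * (X @ theta)
        coef = w * y * sigmoid(-margins)
        grad = -(X * coef[:, None]).sum(axis=0) / w.sum() + lam * theta
        theta -= lr * grad
    return theta


def active_learning_loop(X_pool, oracle, budget, rng, mix=0.2):
    """Sequential pool-based loop: one sampled query per round.

    ASSUMPTION: the sampling distribution here is a placeholder
    (favor points with small |margin|, mixed with uniform); it is not
    the bandit-derived distribution of the paper.
    """
    n = X_pool.shape[0]
    theta = np.zeros(X_pool.shape[1])
    idx, labels, weights = [], [], []
    for _ in range(budget):
        # 1. Assign a sampling distribution on the pool.
        scores = np.exp(-np.abs(X_pool @ theta))
        p = (1.0 - mix) * scores / scores.sum() + mix / n
        # 2. Sample one point from this distribution (with replacement).
        i = int(rng.choice(n, p=p))
        # 3. Query the oracle for the label of the sampled point.
        idx.append(i)
        labels.append(oracle(i))
        # The importance weight 1 / (n * p_i) corrects for non-uniform sampling.
        weights.append(1.0 / (n * p[i]))
        theta = fit_weighted_logreg(
            X_pool[idx], np.array(labels, dtype=float), np.array(weights)
        )
    return theta, idx
```

Mixing with the uniform distribution keeps every sampling probability at least `mix / n`, which bounds the importance weights; this is a design choice of the sketch, not a claim about the paper.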

Viewing Active Learning from the Multi-Armed Bandit Perspective