Xuchuang Wang, Qingyun Wu, Wei Chen, John C.S. Lui
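TL;DR: This paper studies the multi-fidelity multi-armed bandit (MF-MAB) under two objectives: best arm identification and regret minimization. For BAI, it presents a cost complexity lower bound, an algorithmic framework with two alternative fidelity selection procedures, and cost complexity upper bounds for both procedures. For regret minimization, it proposes a new definition of regret and an elimination-based algorithm that solves the problem.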
Abstract
We study the multi-fidelity multi-armed bandit (MF-MAB), an extension of the
canonical multi-armed bandit (MAB) problem. MF-MAB allows each arm to be pulled
at different fidelities, each with its own cost and observation accuracy. We study both the
best arm identification with fixed confidence (BAI)