BriefGPT.xyz
Dec, 2023
Best Arm Identification with Fixed Budget: A Large Deviation Perspective
Po-An Wang, Ruo-Chun Tzeng, Alexandre Proutiere
TL;DR
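Using large deviation principles, we establish a connection between the proportions of arm draws and the arm rewards under adaptive algorithms. This connection lets us improve existing algorithms and design new ones. We prove that the new algorithm outperforms existing algorithms, and extensive experiments confirm this observation.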
Abstract
We consider the problem of identifying the best arm in stochastic multi-armed bandits (MABs) using a fixed sampling budget. Characterizing the minimal instance-specific error probability for this problem constitu…
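To make the fixed-budget setting concrete, here is a minimal sketch of the simplest static strategy: pull the arms round-robin until the budget is spent, then return the arm with the highest empirical mean. This is not the algorithm proposed in the paper. The function names `uniform_best_arm` and `error_rate`, the Bernoulli reward model, and the seeding scheme are all assumptions made for illustration.

```python
import random


def uniform_best_arm(means, budget, seed=0):
    """Static uniform allocation (illustrative baseline, not the paper's method).

    Pulls Bernoulli arms round-robin for `budget` rounds and returns the
    index of the arm with the highest empirical mean.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k
    totals = [0.0] * k
    for t in range(budget):
        arm = t % k  # round-robin: the sampling rule ignores observed rewards
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    # Arms that were never pulled (budget < k) get an empirical mean of 0.
    empirical = [totals[a] / pulls[a] if pulls[a] else 0.0 for a in range(k)]
    return max(range(k), key=lambda a: empirical[a])


def error_rate(means, budget, runs=2000):
    """Monte Carlo estimate of the probability of returning a wrong arm."""
    best = max(range(len(means)), key=lambda a: means[a])
    errors = sum(
        uniform_best_arm(means, budget, seed=s) != best for s in range(runs)
    )
    return errors / runs


if __name__ == "__main__":
    # Hypothetical instance: three Bernoulli arms, the last one is best.
    for budget in (30, 120, 480):
        print(budget, error_rate([0.4, 0.5, 0.6], budget))
```

Running the script prints an estimated error probability for each budget. For a static rule like this one, the estimate would be expected to shrink as the budget grows. Adaptive strategies choose which arm to pull based on past rewards, which is what makes their error probability harder to analyze.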