BriefGPT.xyz
Sep, 2016
On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits
Shahin Shahrampour, Mohammad Noshad, Vahid Tarokh
TL;DR
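This paper studies the best-arm identification problem in multi-armed bandits. It proposes a general framework for sequential elimination algorithms and derives a performance measure based on the sampling mechanism and the number of arms eliminated in each round. It also designs an algorithm that divides the budget according to a nonlinear function of the number of remaining arms, which yields improved theoretical guarantees and experimental performance in the pure-exploration setting.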
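The budget-splitting idea in the summary can be illustrated with a minimal sketch. It is not taken from the paper: the function name `sequential_elimination`, the callback `pull`, the exponent `p`, and the particular normaliser are all assumptions chosen for illustration. The sketch drops one arm per round and gives each surviving arm a cumulative number of pulls proportional to 1 / m^p, where m is the number of arms still active; it assumes the budget is larger than the number of arms.

```python
import math
import random


def sequential_elimination(pull, num_arms, budget, p=1.5):
    """Eliminate one arm per round; return (surviving arm, total pulls used).

    Hypothetical sketch: `pull(arm)` returns one reward sample, and `p` is an
    assumed exponent controlling the nonlinear budget split (requires
    budget > num_arms).
    """
    # Assumed normaliser: keeps the total number of pulls within the budget.
    norm = 2.0 ** (-p) + sum(m ** (-p) for m in range(2, num_arms + 1))
    active = list(range(num_arms))
    sums = [0.0] * num_arms
    counts = [0] * num_arms

    for r in range(1, num_arms):
        remaining = num_arms - r + 1
        # Cumulative pulls each active arm should have by the end of this round.
        target = math.ceil((budget - num_arms) / (norm * remaining ** p))
        for arm in active:
            while counts[arm] < target:
                sums[arm] += pull(arm)
                counts[arm] += 1
        # Discard the arm with the lowest empirical mean reward.
        worst = min(active, key=lambda a: sums[a] / counts[a])
        active.remove(worst)

    return active[0], sum(counts)


# Usage: five Bernoulli arms with assumed success probabilities.
rng = random.Random(0)
means = [0.1, 0.5, 0.9, 0.3, 0.2]
best, used = sequential_elimination(
    lambda a: 1.0 if rng.random() < means[a] else 0.0, len(means), budget=100
)
print(best, used)
```

Setting `p=1` in this sketch makes the split linear in the number of remaining arms; other values of `p` shift more or less of the budget toward the later rounds, where the surviving arms are harder to tell apart.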
Abstract
We consider the best-arm identification problem in multi-armed bandits, which focuses purely on exploration. A player is given a fixed budget to explore a finite set of arms, and the rewards of each arm are drawn