BriefGPT.xyz
Jul, 2022
Instance-optimal PAC Algorithms for Contextual Bandits
Zhaoqi Li, Lillian Ratliff, Houssam Nassif, Kevin Jamieson, Lalit Jain
TL;DR
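This paper studies the stochastic bandit problem in the $(\epsilon,\delta)$-PAC setting, gives upper and lower bounds, and provides a new algorithm based on an argmax oracle that is both instance-optimal and computationally efficient.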
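For readers unfamiliar with the term, the usual $(\epsilon,\delta)$-PAC criterion can be sketched as follows. The notation is an assumption made here for illustration and does not come from this page: $\Pi$ is a policy class, $V(\pi)$ is the expected reward of policy $\pi$, and $\hat{\pi}$ is the policy the learner returns.

$$
\mathbb{P}\left( V(\hat{\pi}) \ge \max_{\pi \in \Pi} V(\pi) - \epsilon \right) \ge 1 - \delta
$$

In words, the returned policy should be within $\epsilon$ of the best policy in the class with probability at least $1-\delta$.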
Abstract
In the stochastic contextual bandit setting, regret-minimizing algorithms have been extensively researched, but their instance-minimizing best-arm identification counterparts remain seldom studied. In this work,