BriefGPT.xyz
Nov, 2023
The Sliding Regret in Stochastic Bandits: Discriminating Index and Randomized Policies
Victor Boone
TL;DR
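This work studies the one-shot behavior of no-regret algorithms for stochastic bandits. It introduces the notion of sliding regret and shows that randomized methods achieve optimal sliding regret, whereas index policies have the worst possible sliding regret under certain conditions on their index.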
Abstract
This paper studies the one-shot behavior of no-regret algorithms for stochastic bandits. Although many algorithms are known to be asymptotically optimal with respect to the expected regret, over a single run, the …