TL;DR本文提出了一个简单而计算高效的算法,能够在多项对数轮内获得ε T-swap遗憾,这在与现有算法相比的超线性轮次要求下,是一种指数级的改进,并解决了“Blum and Mansour 2007”中的主要未解决问题。同时该算法对ε有指数级依赖,但我们证明了一个新的,相匹配的下界。
Abstract
We give a simple and computationally efficient algorithm that, for any constant $\varepsilon>0$, obtains $\varepsilon T$-swap regret within only $T = \mathsf{polylog}(n)$ rounds; this is an exponential improvemen