BriefGPT.xyz
May, 2016
An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits
Peter Auer, Chao-Kai Chiang
TL;DR
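This work presents an algorithm that achieves nearly optimal pseudo-regret bounds for both adversarial and stochastic bandit problems. It also shows that no algorithm with an O(log n) pseudo-regret bound against stochastic bandits can achieve O(sqrt(n)) expected regret against adaptive adversarial bandits.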
Abstract
We present an algorithm that achieves almost optimal pseudo-regret bounds against adversarial and stochastic bandits. Against adversarial bandits ...