Feb, 2016
Delay and Cooperation in Nonstochastic Bandits
Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour, Alberto Minora
TL;DR
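Studies communication networks of learning agents that cooperate to solve a common nonstochastic bandit problem, introduces the Exp3-Coop algorithm, and proves bounds on its regret.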
Abstract
We study networks of communicating learning agents that cooperate to solve a common nonstochastic bandit problem. Agents use an underlying …