BriefGPT.xyz
Mar, 2016
An optimal algorithm for bandit convex optimization
Elad Hazan, Yuanzhi Li
TL;DR
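This paper addresses online convex optimization with bandit feedback (known as bandit convex optimization). By applying the ellipsoid method to online learning, it gives the first $\tilde{O}(\sqrt{T})$-regret algorithm and introduces new tools from discrete convex geometry.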
Abstract
We consider the problem of online convex optimization against an arbitrary adversary with bandit feedback, known as bandit convex optimization. We give the first $\tilde{O}(\sqrt{T})$-