Stochastic Convex Optimization with Bandit Feedback

Jul, 2011

Stochastic convex optimization with bandit feedback

Alekh Agarwal, Dean P. Foster, Daniel Hsu, Sham M. Kakade, Alexander Rakhlin

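TL;DR: This paper proposes a new optimization algorithm for the stochastic bandit feedback model. The algorithm generalizes the ellipsoid method to minimize a convex, Lipschitz function over a convex, compact set, and its regret is shown to scale as $\tilde{O}(\mathrm{poly}(d)\sqrt{T})$ in the number of time steps $T$, which is optimal in its dependence on $T$.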

Abstract

This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $\xset$ under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in \xs