Linear Contextual Bandits with Knapsacks

Jul, 2015

Linear Contextual Bandits with Global Constraints and Objective

Shipra Agrawal, Nikhil R. Devanur

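TL;DR: This paper studies the linear contextual bandit problem with resource consumption. The proposed algorithm achieves near-optimal regret bounds and combines techniques from prior work on linear contextual bandits, bandits with knapsacks, and online stochastic packing problems.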

Abstract

We consider the linear contextual bandit problem with global convex constraints and a concave objective function. In each round, the outcome of pulling an arm is a vector that depends linearly on the context of that arm. The global constraints require the average of these vectors to l