Xutong Liu, Siwei Wang, Jinhang Zuo, Han Zhong, Xuchuang Wang...
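TL;DR: We introduce a novel combinatorial multi-armed bandit (CMAB) framework with multivariant and probabilistically triggering arms (CMAB-MT), where the outcome of each arm is a $d$-dimensional multivariant random variable and the feedback follows a general arm triggering process.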
Abstract
We introduce a novel framework of combinatorial multi-armed bandits (CMAB) with multivariant and probabilistically triggering arms (CMAB-MT), where the outcome of each arm is a $d$-dimensional multivariant random