Fabio Massimo Zennaro, Nicholas Bishop, Joel Dyer, Yorgos Felekis, Anisoara Calinescu...
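TL;DR: Applies transfer learning to causally abstracted multi-armed bandits, studying algorithm learning and regret, to address real-world scenarios related to online advertising.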
Abstract
Multi-armed bandits (MAB) and causal MABs (CMAB) are established frameworks
for decision-making problems. Most prior work studies and
solves individual MAB and CMAB in isolation for a given p