BriefGPT.xyz
Oct, 2018
Unifying the stochastic and the adversarial Bandits with Knapsack
Anshuka Rangi, Massimo Franceschetti, Long Tran-Thanh
TL;DR
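This paper studies the adversarial bandit problem under a knapsack-style budget constraint, applies the EXP3.BwK algorithm to it, proposes an online learning scheme, and gives the corresponding regret bounds. It shows that when action costs are comparable in size to the budget, the achievable regret bound can be far worse than in the case where costs are bounded.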
Abstract
This paper investigates the adversarial bandits with Knapsack (BwK) online learning problem, where a player repeatedly chooses to perform an action, pays the corresponding cost, and receives a reward associated w