BriefGPT.xyz
May, 2024
No-Regret is not enough! Bandits with General Constraints through Adaptive Regret Minimization
Martino Bernasconi, Matteo Castiglioni, Andrea Celli
TL;DR
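By requiring the primal and dual algorithms to be weakly adaptive, we show that, in the bandits with knapsacks framework, sublinear constraint violations can be guaranteed while also achieving optimal performance in both the stochastic and adversarial settings. We also provide the first no-α-regret guarantee for contextual bandits with linear constraints.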
Abstract
In the bandits with knapsacks framework (BwK) the learner has $m$ resource-consumption (packing) constraints. We focus on the generalization of BwK in which the learner has a set of general long-term constraints.