BriefGPT.xyz
May, 2021
Mean Field Equilibrium in Multi-Armed Bandit Game with Continuous Reward
Xiong Wang, Riheng Jia
TL;DR
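This work studies a mean field bandit game with a continuous reward function. It focuses on deriving the existence and uniqueness of the mean field equilibrium, and validates the tightness of the empirical regret for the MAB problem through extensive evaluation results.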
Abstract
Mean field game facilitates analyzing multi-armed bandit (MAB) for a large number of agents by approximating their interactions with an average effect. Existing mean field models for multi-agent MAB mostly assume