BriefGPT.xyz
Sep, 2024
Whittle Index Learning Algorithms for Restless Bandits with Constant Stepsizes
Vishesh Mittal, Rahul Meshram, Surya Prakash
TL;DR
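This paper studies Whittle index learning algorithms for restless multi-armed bandits, addressing a gap in existing research in this area. The authors propose an algorithm that combines Q-learning with an exploration policy and analyze its performance under constant stepsizes. Experiments show that the algorithm can learn the Whittle index effectively and has broad potential for application.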
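As a rough illustration of the ingredients named above, the following is a minimal sketch of generic tabular Q-learning with epsilon-greedy exploration and a constant stepsize. It is not the authors' Whittle index algorithm; the function name `q_learning_constant_step`, the parameter defaults, and the toy one-state problem are assumptions made only for this example.

```python
import random


def q_learning_constant_step(transitions, rewards, num_steps=20000,
                             alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Generic tabular Q-learning (illustrative, not the paper's method).

    transitions[s][a] is a list of next-state probabilities,
    rewards[s][a] is the immediate reward, and alpha is a constant stepsize.
    """
    rng = random.Random(seed)
    n_states = len(transitions)
    n_actions = len(transitions[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(num_steps):
        # Epsilon-greedy exploration: random action with probability epsilon.
        if rng.random() < epsilon:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda i: q[s][i])
        # Sample the next state from the assumed transition model.
        s_next = rng.choices(range(n_states), weights=transitions[s][a])[0]
        # Constant-stepsize update toward the one-step bootstrapped target.
        target = rewards[s][a] + gamma * max(q[s_next])
        q[s][a] += alpha * (target - q[s][a])
        s = s_next
    return q


# Hypothetical toy problem: one state, action 0 pays 0 and action 1 pays 1.
toy_q = q_learning_constant_step([[[1.0], [1.0]]], [[0.0, 1.0]])
```

In this toy problem the optimal values are known in closed form (1 / (1 - gamma) for the rewarding action), which makes it easy to check that the constant-stepsize iterates settle near them.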
Abstract
We study the Whittle index learning algorithm for restless multi-armed bandits. We consider index learning algorithm with Q-learning. We first present