Multi-armed bandit (MAB) systems are increasingly applied
in multi-agent distributed environments, driving the development of
collaborative MAB algorithms. In such settings, communication between agents
executing actions and the primary learner making decisions