We study a novel multi-armed bandit problem that models the challenge faced
by a company wishing to explore new strategies to maximize revenue whilst
maintaining its revenue above a fixed baseline, uniformly over
time. While previous work addressed the problem under th