constrained markov decision processes (CMDPs) model scenarios of sequential decision making with multiple objectives that are increasingly important in many applications. However, the model is often unknown and must be learned online while still ensuring the constraint is met, or at le