Robust Markov decision processes (RMDPs) aim to ensure robustness with
respect to changing or adversarial system behavior. In this framework,
transitions are modeled as arbitrary elements of a known and properly
structured uncertainty set, and a robust optimal policy can be derived u