Many sequential decision problems can be formulated as Markov Decision
Processes (MDPs) in which the optimal value function (or cost-to-go function)
can be shown to be monotone in some or all of its dimensions. When
the state space becomes large, traditional techniques, such as the backward
dynamic programming algorithm (i.e., backward inductio