Value iteration is a powerful yet inefficient algorithm for Markov decision
processes (MDPs) because it spends most of its effort backing up the
entire state space, which turns out to be unnecessary in many cases. In order
to overcome this problem, many approaches have been