We propose empirical dynamic programming algorithms for Markov decision
processes (MDPs). In these algorithms, the exact expectation in the Bellman
operator of classical value iteration is replaced by an empirical estimate,
yielding `empirical value iteration' (EVI). Policy evaluation and