Reasoning about the future -- understanding how decisions in the present time
affect outcomes in the future -- is one of the central challenges for
reinforcement learning (RL), especially in highly-stochastic or partially
observable environments. While predicting the future directly is