We present an approach to learn an object-centric forward model, and show
that this allows us to plan for sequences of actions to achieve distant desired
goals. We propose to model a scene as a collection of objects, each with an
explicit spatial location and implicit visual feature, a