Agents that can understand and reason over the dynamics of objects can have a
better capability to act robustly and generalize to novel scenarios. Such an
ability, however, requires a suitable representation of the scene as well as an
understanding of the mechanisms that govern the interactions of different
subsets of objects. To address this problem, we pro