Modeling instance-level context and object-object relationships is extremely
challenging. It requires reasoning about bounding boxes of different classes,
locations, \etc. Above all, instance-level spatial reasoning inherently requires
modeling conditional distributions on previous dete