This paper presents a fully convolutional scene graph generation (FCSGG)
model that detects objects and relations simultaneously. Most of the scene
graph generation frameworks use a pre-trained two-stage object detector, like
Faster R-CNN, and build scene graphs using bounding box feat