weakly supervised object detection has recently received much attention,
since it only requires image-level labels instead of the bounding-box labels
consumed in strongly supervised learning. Nevertheless, the save in labeling
expense is usually at the cost of model accuracy. In this p