large scale visual understanding is challenging, as it requires a model to handle the widely-spread and imbalanced distribution of triples. In real-world scenarios with large numbers of objects and relations, some are seen very commonly while others are bare