Jan, 2024

DiffAugment: Diffusion-Based Long-Tailed Visual Relationship Recognition

TL;DR: DiffAugment applies diffusion models to Visual Relationship Recognition (VRR) to address the imbalanced distribution of triplets. It introduces a hardness-aware component and a subject/object-based seeding strategy, and improves per-class accuracy on the GQA-LT dataset.
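
To make the reported metric concrete, here is a minimal sketch of per-class accuracy, assuming it means the unweighted mean of each class's own accuracy. The function name `per_class_accuracy` and the toy labels are hypothetical and are not taken from the paper or from GQA-LT. The sketch shows why this metric suits an imbalanced distribution: a model that only ever predicts the frequent class can score well on overall accuracy while scoring poorly per class.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Return (mean of per-class accuracies, dict of accuracy per class)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    # Each class contributes equally, regardless of how many samples it has.
    per_class = {c: correct[c] / total[c] for c in total}
    return sum(per_class.values()) / len(per_class), per_class


# Toy imbalanced labels (illustrative only): "on" is frequent, "riding" is rare.
y_true = ["on"] * 8 + ["riding"] * 2
y_pred = ["on"] * 10  # a model that always predicts the frequent class

mean_acc, per_class = per_class_accuracy(y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"overall accuracy:   {overall:.2f}")
print(f"per-class accuracy: {mean_acc:.2f}")
```

In this toy case the rare class is never predicted correctly, so it pulls the per-class score down even though most samples are classified correctly.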