BriefGPT.xyz
Jul, 2024
Visual Grounding with Attention-Driven Constraint Balancing
Weitai Kang, Luowei Zhou, Junyi Wu, Changchang Sun, Yan Yan
TL;DR
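This paper introduces AttBalance, a new framework that improves visual grounding performance by optimizing the behavior of visual features within language-relevant regions. Evaluated on five different models across four benchmarks, it yields consistent improvements and achieves a new state of the art on QRNet.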
Abstract
Unlike object detection, the visual grounding task necessitates the detection of an object described by complex free-form language. To simultaneously model such complex semantic and visual representations, recent state-of-the-art studies adopt …