Yitian Yuan, Xiaohan Lan, Long Chen, Wei Liu, Wenwu Zhu
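TL;DR: This paper studies Temporal Sentence Grounding in Videos (TSGV). Working within the existing evaluation protocol, it re-organizes two widely used TSGV benchmarks and introduces a new evaluation metric, dR@n,IoU@m, which calibrates the basic IoU scores in order to better monitor progress in TSGV.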
Abstract
Although temporal sentence grounding in videos (TSGV) has made impressive progress over the last few years, current TSGV models tend to capture the moment annotation biases and fail to take full advantage of multi-mo