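TL;DR: This paper proposes Elastic Moment Bounding (EMB) and a guided attention mechanism to address the uncertainty of temporal annotations when training video activity localisation models, improving accuracy and robustness on natural videos.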
Abstract
Current methods for video activity localisation over time assume implicitly
that activity temporal boundaries labelled for model training are determined
and precise. However, in unscripted natural videos, differe