BriefGPT.xyz
Feb, 2018
Joint Event Detection and Description in Continuous Video Streams
Huijuan Xu, Boyang Li, Vasili Ramanishka, Leonid Sigal, Kate Saenko
TL;DR
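JEDDi-Net is a neural network for dense video captioning. It continuously encodes the input video stream with three-dimensional convolutional layers, uses temporally pooled features to propose variable-length temporal events, and then generates a caption for each proposed event. JEDDi-Net performs strongly on large-scale datasets.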
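To make the idea of temporal pooling over variable-length events concrete, here is a minimal sketch of how a proposal of any length could be reduced to a fixed number of feature vectors by max-pooling over time. It is an illustration under stated assumptions, not the authors' implementation: the function name `pool_segment`, the `num_bins` parameter, and the toy feature values are all hypothetical, and per-timestep features are plain Python lists rather than 3D convolutional activations.

```python
def pool_segment(features, start, end, num_bins=2):
    """Max-pool features[start:end] over time into num_bins fixed bins.

    features: list of per-timestep feature vectors (lists of floats).
    start, end: proposal boundaries as timestep indices (end is exclusive).
    Hypothetical helper for illustration only.
    """
    if not 0 <= start < end <= len(features):
        raise ValueError("proposal must lie inside the feature sequence")
    length = end - start
    pooled = []
    for b in range(num_bins):
        # Split the proposal into roughly equal bins; each bin covers
        # at least one timestep, so short proposals reuse timesteps.
        lo = start + (b * length) // num_bins
        hi = start + max(((b + 1) * length) // num_bins,
                         (b * length) // num_bins + 1)
        hi = min(hi, end)
        window = features[lo:hi]
        # Element-wise maximum across the timesteps in this bin.
        pooled.append([max(column) for column in zip(*window)])
    return pooled


# Toy sequence: 5 timesteps, 2 feature channels each (made-up values).
features = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.3], [0.5, 0.8], [0.2, 0.6]]
print(pool_segment(features, 1, 5))  # proposal covering timesteps 1..4
```

Whatever the proposal length, the output always has `num_bins` vectors, which is what lets a downstream captioning module accept events of different durations.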
Abstract
As a fine-grained video understanding task, dense video captioning involves first localizing events in a video and then generating captions for the identified events. We present the Joint Event Detection and Desc…