BriefGPT.xyz
Feb, 2019
Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning
Nayyer Aafaq, Naveed Akhtar, Wei Liu, Syed Zulqarnain Gilani, Ajmal Mian
TL;DR
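This paper proposes a visual feature encoding technique that uses gated recurrent units (GRUs) to generate semantically rich video captions, and it sets a new state of the art on the METEOR and ROUGE_L metrics on the MSVD and MSR-VTT datasets.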
Abstract
Automatic generation of video captions is a fundamental challenge in computer vision. Recent techniques typically employ a combination of Convolutional Neural Networks (CNNs) and Recursive Neural Networks (RNNs) ...