BriefGPT.xyz
Apr, 2019
End-to-End Video Caption Generation
An End-to-End Baseline for Video Captioning
Silvio Olivastri, Gurkirt Singh, Fabio Cuzzolin
TL;DR
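This paper proposes an end-to-end training approach to video caption generation and achieves state-of-the-art results on the Microsoft Research Video Description (MSVD) corpus and the Microsoft Video-to-Text (MSR-VTT) dataset.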
Abstract
Building correspondences across different modalities, such as video and language, has recently become critical in many visual recognition applications, such as video captioning. Inspired by machine translation, r