BriefGPT.xyz
Nov, 2019
Low-Rank HOCA: Efficient High-Order Cross-Modal Attention for Video Captioning
Tao Jin, Siyu Huang, Yingming Li, Zhongfei Zhang
TL;DR
This paper introduces a video captioning model based on a high-order cross-modal attention mechanism, which computes attention weights from the interactions among modalities and uses low-rank tensor decomposition for an efficient implementation. Experiments show that the new model achieves the best results on two benchmark datasets.
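The core idea behind the efficiency claim can be sketched as follows. A high-order interaction among, say, three modality feature vectors would require a full 3-way weight tensor; factorizing that tensor into a sum of rank-1 terms makes the score computable from inner products only. This is a minimal illustrative sketch, not the authors' implementation: the function name, the three-modality setup, and all dimensions are assumptions.

```python
import numpy as np

# Hypothetical sketch of a low-rank high-order attention score for three
# modality feature vectors x1, x2, x3 (e.g. appearance, motion, text).
# A full trilinear form needs a weight tensor W of shape (d1, d2, d3);
# the low-rank factorization W = sum_r U1[:,r] ⊗ U2[:,r] ⊗ U3[:,r]
# lets the score be computed from R-dimensional projections instead.

def low_rank_score(x1, x2, x3, U1, U2, U3):
    """score = sum_r (U1[:,r]·x1) * (U2[:,r]·x2) * (U3[:,r]·x3)."""
    return float(np.sum((U1.T @ x1) * (U2.T @ x2) * (U3.T @ x3)))

rng = np.random.default_rng(0)
d1, d2, d3, R = 8, 8, 8, 4  # assumed feature dims and rank
x1, x2, x3 = rng.normal(size=d1), rng.normal(size=d2), rng.normal(size=d3)
U1 = rng.normal(size=(d1, R))
U2 = rng.normal(size=(d2, R))
U3 = rng.normal(size=(d3, R))

score = low_rank_score(x1, x2, x3, U1, U2, U3)
```

Here the full tensor would hold d1*d2*d3 = 512 parameters, while the factorized form needs only R*(d1+d2+d3) = 96, which is the kind of saving the low-rank decomposition provides.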
Abstract
This paper addresses the challenging task of video captioning, which aims to generate descriptions for video data. Recently, attention-based encoder-decoder structures have been widely used in