BriefGPT.xyz
Aug, 2016
Learning Joint Representations of Videos and Sentences with Web Image Search
Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya
TL;DR
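This work targets video retrieval from natural language queries and trains an embedding model for the retrieval task. It combines web image search with the embedding model to disambiguate fine-grained visual concepts, which yields clear improvements on video and sentence retrieval and description generation performance comparable to the current state of the art.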
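As a rough illustration of what retrieval with an embedding model involves, the sketch below ranks videos against a sentence query once both have been mapped to vectors of the same dimension. It is not the authors' implementation: the toy vectors, the names `cosine` and `rank_videos`, and the choice of cosine similarity as the scoring function are all assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed scoring function)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_videos(query_vec, video_vecs):
    """Return (video_id, score) pairs sorted from most to least similar."""
    scored = [(vid, cosine(query_vec, vec)) for vid, vec in video_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical embeddings; in practice a trained model would produce these.
videos = {
    "clip_a": [0.9, 0.1, 0.0],
    "clip_b": [0.1, 0.8, 0.3],
    "clip_c": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # hypothetical embedding of a sentence query

ranking = rank_videos(query, videos)
for video_id, score in ranking:
    print(f"{video_id}: {score:.3f}")
```

Sentence retrieval for a given video would follow the same pattern with the roles swapped: the video embedding becomes the query and the candidate sentences are ranked.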
Abstract
Our objective is video retrieval based on natural language queries. In addition, we consider the analogous problem of retrieving sentences or generating descriptions given an input video. Recent work has addresse