Improving Textual-Visual Cross-Modal Retrieval with Generative Models: Look, Imagine and Match

Nov, 2017

Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models

Jiuxiang Gu, Jianfei Cai, Shafiq Joty, Li Niu, Gang Wang

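TL;DR: This paper proposes a new cross-modal retrieval method that uses generative models to learn both global and local features of multimodal data, achieving state-of-the-art cross-modal retrieval results on the MSCOCO dataset.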

Abstract

Textual-visual cross-modal retrieval has been a hot research topic in both computer vision and natural language processing communities. Learning appropriate representations for multi-modal data is crucial for the