From Captions to Visual Concepts and Back

Nov, 2014

Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng...

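TL;DR: This paper presents a novel approach for automatically generating image descriptions, using visual detectors, language models, and multimodal similarity models learned directly from a dataset of image captions.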

Abstract

This paper presents a novel approach for automatically generating image descriptions: visual detectors and language models learn directly