Jul, 2021
From Show to Tell: A Survey on Image Captioning
Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, Silvia Cascianelli, Giuseppe Fiameni...
TL;DR
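This paper gives a comprehensive overview of image captioning, covering visual encoding, text generation, training strategies, datasets, and evaluation metrics. It quantitatively compares many relevant state-of-the-art approaches to identify the most impactful technical innovations in architectures and training strategies, and it discusses the many variants of the problem and its open challenges. The aim is to provide a tool for understanding the existing literature and for highlighting future directions in computer vision and natural language processing.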
Abstract
Connecting vision and language plays an essential role in generative intelligence. For this reason, in the last few years, a large research effort has been devoted to