Improving Multimodal Neural Machine Translation with Richer Visual Features and Visually Grounded Word Embeddings

Jul, 2017

Visually Grounded Word Embeddings and Richer Visual Features for Improving Multimodal Neural Machine Translation

Jean-Benoit Delbrouck, Stéphane Dupont, Omar Seddati

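TL;DR: This paper explores using a dense captioning model for visual feature extraction and word embeddings in multimodal neural machine translation (MNMT), with the aim of improving models that translate image descriptions.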

Abstract

In multimodal neural machine translation (MNMT), a neural model generates a translated sentence that describes an image, given the image itself and one source description in English. This is considered as the multimodal →