Jan, 2023
Universal Multimodal Representation for Language Understanding
Zhuosheng Zhang, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita...
TL;DR
This paper proposes a new method that uses visual information as an assistant signal for NLP tasks. A Transformer encoder and a convolutional neural network encode the text and images, respectively, and an attention layer fuses the representations of the two modalities. Experimental results show that the method performs well across a range of tasks and languages.
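A minimal sketch of the encode-then-fuse setup described above, built from standard PyTorch modules. The class name, dimensions, and backbone choice (ResNet-50) are illustrative assumptions, not the paper's released implementation.

```python
import torch
import torch.nn as nn
import torchvision.models as tv_models


class TextImageFusion(nn.Module):
    """Sketch: Transformer encoder for text, CNN for images, attention fusion."""

    def __init__(self, vocab_size=32000, d_model=512, n_heads=8, n_layers=6):
        super().__init__()
        # Transformer encoder over the token sequence
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.text_encoder = nn.TransformerEncoder(enc_layer, n_layers)
        # Convolutional backbone for the images (classification head removed)
        resnet = tv_models.resnet50(weights=None)
        self.image_encoder = nn.Sequential(*list(resnet.children())[:-1])
        self.img_proj = nn.Linear(2048, d_model)
        # Attention layer that lets text tokens attend to image features
        self.fusion_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, token_ids, images):
        # token_ids: (B, L); images: (B, K, 3, 224, 224)
        text = self.text_encoder(self.embed(token_ids))              # (B, L, d)
        b, k = images.shape[:2]
        img = self.image_encoder(images.flatten(0, 1)).flatten(1)    # (B*K, 2048)
        img = self.img_proj(img).view(b, k, -1)                      # (B, K, d)
        # Fuse the modalities: text queries attend over image keys/values
        fused, _ = self.fusion_attn(query=text, key=img, value=img)  # (B, L, d)
        return text + fused  # residual combination of text and visual signal
```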
Abstract
Representation learning is the foundation of natural language processing (NLP). This work presents new methods to employ visual information as assistant signals to general NLP tasks. For each sentence, we first r…