A Visual Tour of Current Challenges in Multimodal Language Models

Oct, 2022

Shashank Sonkar, Naiming Liu, Richard G. Baraniuk

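TL;DR: This study explores how much visual grounding helps the acquisition of function words, using multimodal text-to-image generation, and finds that multimodal models effectively model function words only for a very small set of pronoun subcategories and relative pronouns.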

Abstract

Transformer models trained on massive text corpora have become the de facto models for a wide range of natural language processing tasks. However, learning effective word representations for …