BriefGPT.xyz
Jul, 2023
Vision Language Transformers: A Survey
Clayton Fields, Casey Kennington
TL;DR
Pretrained transformer architectures perform well at vision language modeling, bringing similar advances to tasks that combine vision and language.
Abstract
Vision language tasks, such as answering questions about or generating captions that describe an image, are difficult tasks for computers to perform. A relatively recent body of research has adapted the pretrained trans