BriefGPT.xyz
Mar, 2023
eP-ALM: Efficient Perceptual Augmentation of Language Models
Mustafa Shukor, Corentin Dancette, Matthieu Cord
TL;DR
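This paper proposes eP-ALM, an efficient method for adapting unimodal pretrained models to multimodal tasks. With most parameters frozen, only one linear projection layer trained, and only one trainable token prepended, it significantly outperforms the baselines and achieves the best performance on multiple VQA and captioning benchmarks across the image, video, and audio modalities.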
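The parameter budget described above can be illustrated with a minimal sketch. It is not the authors' implementation: the dimensions, the frozen-parameter count, and the names `AdapterConfig`, `project`, and `prepend_token` are all hypothetical and chosen only for illustration. The summary also does not say how the projected perceptual feature is fed into the language model, so the sketch stops at producing it.

```python
from dataclasses import dataclass


@dataclass
class AdapterConfig:
    # All values are hypothetical; the summary gives no concrete sizes.
    encoder_dim: int = 768                # feature size of the frozen perceptual encoder
    lm_dim: int = 2048                    # hidden size of the frozen language model
    frozen_params: int = 2_700_000_000    # combined size of the frozen backbones


def trainable_params(cfg: AdapterConfig) -> int:
    """Count the two trainable pieces: one linear projection and one token."""
    projection = cfg.encoder_dim * cfg.lm_dim + cfg.lm_dim  # weights + bias
    soft_token = cfg.lm_dim                                 # one trainable embedding
    return projection + soft_token


def trainable_fraction(cfg: AdapterConfig) -> float:
    """Share of all parameters that would receive gradient updates."""
    trainable = trainable_params(cfg)
    return trainable / (trainable + cfg.frozen_params)


def project(feature, weight, bias):
    """Linear projection y = W x + b on plain Python lists."""
    return [sum(w * x for w, x in zip(row, feature)) + b
            for row, b in zip(weight, bias)]


def prepend_token(soft_token, text_embeddings):
    """Place the single trainable token in front of the text embeddings."""
    return [soft_token] + list(text_embeddings)


if __name__ == "__main__":
    cfg = AdapterConfig()
    print(trainable_params(cfg))
    print(f"{trainable_fraction(cfg):.6f}")
```

Under these assumed sizes the trainable share is well below one percent of the total, which is the sense in which most parameters stay frozen.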
Abstract
Large language models (LLMs) have so far impressed the world, with unprecedented capabilities that emerge in models at large scales. On the vision side, transformer models (i.e., ViT) are following the same trend