BriefGPT.xyz
Jan, 2024
Dynamic Transformer Architecture for Multimodal Tasks
Dynamic Transformer Architecture for Continual Learning of Multimodal Tasks
Yuliang Cai, Mohammad Rostami
TL;DR
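We propose TAM-CL, a Transformer-based continual learning framework for multimodal tasks that involve both vision and language. It introduces additional parameters and uses knowledge distillation to enable information exchange between tasks and to mitigate catastrophic forgetting. The method achieves state-of-the-art performance on a range of challenging multimodal tasks.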
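The summary does not say how TAM-CL formulates its distillation term, so the following is only a minimal sketch of a generic, temperature-scaled knowledge distillation loss of the kind often used against catastrophic forgetting. The function names, the `temperature` parameter, and the choice of a KL-divergence objective are assumptions made for illustration; they are not taken from the paper.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """KL(old || new) between softened output distributions.

    old_logits: outputs of the frozen model trained on earlier tasks (teacher).
    new_logits: outputs of the model being trained on the current task (student).
    The T^2 factor is the conventional rescaling for softened targets;
    it is an assumption here, not a detail confirmed by the source.
    """
    if len(old_logits) != len(new_logits):
        raise ValueError("logit vectors must have the same length")
    log_p = log_softmax(old_logits, temperature)
    log_q = log_softmax(new_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl
```

In a continual learning setup, a term like this would typically be added to the current task's loss, so that the updated model is penalized when its predictions drift away from those of the model trained on earlier tasks.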
Abstract
Transformer neural networks are increasingly replacing prior architectures in a wide range of applications in different data modalities. The increasing size and computational demands of fine-tuning large pre-trained tra…