Visual Instruction Tuning

April 2023

Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee

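TL;DR: This paper uses the language model GPT-4 to generate multimodal image-text instruction-following data for tuning a multimodal model, yielding a new model, LLaVA, which performs strongly on multiple datasets.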

Abstract

Instruction tuning large language models (LLMs) using machine-generated instruction-following data has improved zero-shot capabilities on new tasks, but the idea is less explored in the →