BriefGPT.xyz
Apr, 2024
Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning
Rongjie Li, Yu Wu, Xuming He
TL;DR
A second tuning stage guided by Image-Conditioned Caption Correction (ICCC) improves zero-shot vision-language reasoning performance.
Abstract
Generative vision-language models (VLMs) have shown impressive performance in zero-shot vision-language tasks like image captioning and visual question answering. However, improving their zero-shot reasoning typi