BriefGPT.xyz
Nov, 2023
GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks
Xinlu Zhang, Yujie Lu, Weizhi Wang, An Yan, Jun Yan...
TL;DR
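GPT-4V shows great potential as a general-purpose evaluator for multimodal tasks. Despite some limitations, its agreement with human judgments and its ability to provide detailed explanations make it a promising basis for a generalist automatic evaluator.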
Abstract
Automatically evaluating vision-language tasks is challenging, especially when it comes to reflecting human judgments due to limitations in accounting for fine-grained details. Although GPT-4V has shown promising ...