Fine-grained Evaluation of Image Captioning Based on the REO Criteria

Sep, 2019

REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning

Ming Jiang, Junjie Hu, Qiuyuan Huang, Lei Zhang, Jana Diesner...

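TL;DR: This study proposes REO, a fine-grained evaluation method that assesses image captioning systems along three aspects: relevance to the ground truth, extraness (superfluous content), and omission (missing content). Experiments show that it is more consistent with human judgments and that its results are more intuitive.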

Abstract

Popular metrics used for evaluating image captioning systems, such as BLEU and CIDEr, provide a single score to gauge the system's overall effectiveness. This score is often not informative enough to indicate what specific errors are made by a given system. In this study, we present a