Large Vision Language Models (LVLMs) exhibit remarkable capabilities but struggle with hallucinations, that is, inconsistencies between images and their descriptions. Previous hallucination evaluation studies on LVLMs have identified hallucinations in terms of objects, attributes, and relations, but have overlooked complex hallucinations that create an entire narrative around a fictional entity. In this paper, we introduce a refined taxonomy of hallucinations, featuring a new category: Event Hallucination. We then utilize advanced LLMs to generate and filter fine-grained hallucinatory data consisting of various types of hallucinations, with a particular focus on event hallucinations, laying the groundwork for integrating discriminative and generative evaluation methods within our universal evaluation framework. The proposed benchmark distinctively assesses LVLMs' ability to tackle a broad spectrum of hallucinations, making it a reliable and comprehensive tool for gauging LVLMs' efficacy in handling hallucinations. We will release our code and data.

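This study proposes a fine-grained taxonomy of hallucinations that includes event hallucination, and uses advanced LLMs to generate and filter data covering the various hallucination types. It integrates discriminative and generative evaluation methods within a universal evaluation framework to assess how well large vision language models handle hallucinations, providing a reliable and comprehensive tool for hallucination evaluation.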

Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models