Multimodal large language models have made significant advancements in recent
years, yet they still suffer from a common issue known as the "hallucination
problem": the models generate textual descriptions that misrepresent the image
or mention content that is not in it. To address this issue, this paper
introduces a novel strategy: Hallucination-Aware Direct Preference Optimization
(HA-DPO). Our approach treats the hallucination problem as a preference
selection problem, where the model is trained to favor the non-hallucinating
response when presented with two responses to the same image (one accurate and
one hallucinating). This paper also presents an efficient process for
constructing hallucination sample pairs to ensure high-quality,
style-consistent pairs for stable HA-DPO training. We applied this strategy to
two mainstream multimodal models, and the results showed a significant
reduction in the hallucination problem and an enhancement in the models'
generalization capabilities. With HA-DPO, the MiniGPT-4 model improves
substantially: POPE accuracy increases from 51.13% to 85.66% (an absolute
improvement of 34.5 percentage points), and the MME score rises from 968.58
to 1365.76 (41% relative improvement). The code, models, and datasets will be
made publicly
available.
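
To make the preference-selection idea concrete, the following is a minimal sketch of the standard Direct Preference Optimization loss for a single response pair. It is not taken from the HA-DPO code: the function name `dpo_loss`, its arguments, and the default `beta=0.1` are assumptions for illustration, and the hallucination-aware objective used in the paper may differ in details not stated here. The inputs are summed token log-probabilities of the accurate ("chosen") and hallucinating ("rejected") responses under the trained model and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    All names and the default beta are illustrative assumptions,
    not the paper's implementation.
    """
    # How much more (in log space) the policy likes each response
    # than the frozen reference model does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Positive margin: the policy prefers the non-hallucinating
    # response more strongly than the reference model does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)), computed as a stable softplus(-margin).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities for one image and two responses.
loss = dpo_loss(policy_chosen_logp=-8.0, policy_rejected_logp=-12.0,
                ref_chosen_logp=-10.0, ref_rejected_logp=-10.0)
print(f"DPO loss: {loss:.4f}")
```

When the policy and reference model agree, the margin is zero and the loss equals log 2. Raising the probability of the accurate response relative to the hallucinating one pushes the loss below that value, which is the behavior the training described above rewards.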

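This paper proposes a new strategy, Hallucination-Aware Direct Preference Optimization (HA-DPO), which addresses the "hallucination problem" in multimodal large language models by training the model to favor the non-hallucinating response when given two responses to the same image (one accurate and one hallucinating). The results show that applying HA-DPO significantly improves the performance of the MiniGPT-4 model.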