BriefGPT.xyz
Feb, 2024
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation
Yichi Zhang, Ziqiao Ma, Xiaofeng Gao, Suhaila Shakiah, Qiaozi Gao...
TL;DR
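Built on holistic segmentation, GROUNDHOG connects multimodal large language models to entity tokens, improving language-to-object grounding and strengthening visual understanding and diagnosis.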
Abstract
Most multimodal large language models (MLLMs) learn language-to-object grounding through causal language modeling where grounded objects are captured by bounding boxes as sequences of location tokens. This paradi