Mar, 2024

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

TL;DR: Griffon v2, a high-resolution generalist model, overcomes image resolution limitations in large vision-language models to achieve nuanced visual and language referring, and outperforms expert models in object detection and counting.