Dec 2023
A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise
Chaoyou Fu, Renrui Zhang, Haojia Lin, Zihan Wang, Timin Gao...
TL;DR: This paper explores Gemini Pro as a challenger to GPT-4V in multi-modal learning. Gemini Pro shows visual reasoning capabilities comparable to those of GPT-4V, though the two models differ in answering style and preferences, while Sphinx lags behind them in domain generalizability. Quantitative evaluation on the MME benchmark suggests that Gemini has the potential to be a strong contender.