BriefGPT.xyz
Apr, 2024
Probing the 3D Awareness of Visual Foundation Models
Mohamed El Banani, Amit Raj, Kevis-Kokitsi Maninis, Abhishek Kar, Yuanzhen Li...
TL;DR
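Recent advances in large-scale pretraining have yielded visual foundation models with strong capabilities. We analyze the 3D awareness of visual foundation models and, through a series of experiments, reveal several limitations of current models.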
Abstract
Recent advances in large-scale pretraining have yielded visual foundation models with strong capabilities. Not only can recent models generalize to arbitrary images for their training task, their intermediate rep