BriefGPT.xyz
Jan, 2023
Joint Representation Learning for Text and 3D Point Cloud
Rui Huang, Xuran Pan, Henry Zheng, Haojun Jiang, Zhifeng Xie...
TL;DR
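This paper proposes a novel framework, Text4Point, that uses 2D images as a bridge between the point cloud and language modalities: it establishes correspondences between images and point clouds and aligns the two through contrastive learning. It further introduces a text querying module that queries text embeddings with point cloud features, integrating language information into 3D representation learning and improving performance on a range of downstream tasks.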
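The two ideas in the summary can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the symmetric InfoNCE form of the contrastive loss, the temperature of 0.07, the single-head dot-product attention used for the text query, and all function names (`symmetric_info_nce`, `text_query`) are assumptions made for illustration only. The sketch assumes row `i` of the image features and row `i` of the point cloud features come from the same scene.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def symmetric_info_nce(image_feats, point_feats, temperature=0.07):
    """Contrastive loss that pulls matched image/point-cloud pairs together.

    Assumption: row i of both inputs describes the same scene, so the
    positives sit on the diagonal of the similarity matrix.
    """
    img = l2_normalize(image_feats)
    pts = l2_normalize(point_feats)
    logits = img @ pts.T / temperature          # shape (batch, batch)
    idx = np.arange(len(img))
    loss_i2p = -log_softmax(logits)[idx, idx].mean()    # image -> point cloud
    loss_p2i = -log_softmax(logits.T)[idx, idx].mean()  # point cloud -> image
    return 0.5 * (loss_i2p + loss_p2i)


def text_query(point_feats, text_embeds):
    """Hypothetical text query: point features attend over text embeddings.

    Point features act as queries; text embeddings act as keys and values.
    Returns one language-conditioned vector per point feature.
    """
    scale = np.sqrt(point_feats.shape[1])
    weights = np.exp(log_softmax(point_feats @ text_embeds.T / scale))
    return weights @ text_embeds


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image_feats = rng.normal(size=(8, 16))
    point_feats = image_feats + 0.01 * rng.normal(size=(8, 16))
    text_embeds = rng.normal(size=(4, 16))
    print("contrastive loss:", symmetric_info_nce(image_feats, point_feats))
    print("queried text shape:", text_query(point_feats, text_embeds).shape)
```

In this sketch the loss is small when paired rows are similar and grows when the pairing is scrambled, which is the behaviour the alignment step relies on; the output of `text_query` could then be fused with the point features, though how the paper performs that fusion is not stated in the summary above.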
Abstract
Recent advancements in vision-language pre-training (e.g. CLIP) have shown that vision models can benefit from language supervision. While many models using language modality have achieved great success on 2D vis