TL;DR: This work proposes MaskPoint, a discriminative Transformer-based masked pretraining framework for point clouds. It represents the point cloud as discrete occupancy values and casts masked prediction as a simple binary classification between masked object points and sampled noise points, making the pretext task robust. The pretrained model performs strongly on a range of downstream tasks, including 3D shape classification, segmentation, and real-world object detection.
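The discriminative pretext task described above can be sketched as follows. This is a minimal illustration of how positive and negative occupancy queries might be constructed, not the authors' implementation; the point counts, masking ratio, and sampling ranges are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "point cloud": points on a unit sphere surface (stand-in for an object).
n_points = 1024
pts = rng.normal(size=(n_points, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

# Mask out 90% of the points; the encoder only sees the remaining 10%.
n_masked = int(0.9 * n_points)
masked_idx = rng.choice(n_points, size=n_masked, replace=False)
masked_pts = pts[masked_idx]

# Positive queries: points drawn from the masked region (occupancy = 1).
# Negative queries: random noise points in the bounding volume (occupancy = 0).
n_queries = 256
pos = masked_pts[rng.choice(n_masked, size=n_queries, replace=False)]
neg = rng.uniform(-1.0, 1.0, size=(n_queries, 3))

queries = np.concatenate([pos, neg], axis=0)
labels = np.concatenate([np.ones(n_queries), np.zeros(n_queries)])
# A decoder would score each query point; pretraining minimizes binary
# cross-entropy of the predicted occupancy against these labels.
```

Because the target is a discrete occupancy label rather than exact point coordinates, the objective sidesteps the sampling variance that makes direct point regression noisy.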
Abstract
Masked autoencoding has achieved great success for self-supervised learning in the image and language domains. However, mask-based pretraining has yet to show benefits for point cloud understanding, likely because standard backbones like PointNet cannot properly handle the training-versus-testing distribution mismatch introduced by masking during training.