Perceiving transparent objects for grasping and manipulation remains a major challenge. Existing robotic grasping methods rely heavily on depth maps, but the unique visual properties of transparent objects cause depth sensors to return depth maps with gaps and inaccuracies. To address this issue, we propose an end-to-end network for transparent object depth completion that combines the strengths of single-view RGB-D depth completion and multi-view depth estimation. We also introduce a depth refinement module based on confidence estimation, which fuses the depth maps predicted by the single-view and multi-view modules to further refine the restored depth map. Extensive experiments on the ClearPose and TransCG datasets demonstrate that our method achieves higher accuracy and robustness than state-of-the-art methods in complex scenes with significant occlusion.

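We propose an end-to-end network for transparent object depth completion that combines the advantages of single-view RGB-D depth completion and multi-view depth estimation, and we introduce a depth refinement module based on confidence estimation that further improves the restored depth map. Extensive experiments on the ClearPose and TransCG datasets show that, compared with existing methods, our method achieves higher accuracy and robustness in complex scenes with significant occlusion.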
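
To make the idea of confidence-based fusion concrete, the following is a minimal sketch of one common way to merge two depth predictions: a per-pixel weighted average whose weights come from a softmax over two confidence scores. This is an illustrative assumption, not the refinement module proposed in this work, whose architecture is not specified here. The function name `fuse_depths` and all argument names are hypothetical, and depth maps are represented as plain nested lists to keep the example dependency-free.

```python
import math


def fuse_depths(depth_sv, depth_mv, conf_sv, conf_mv):
    """Fuse two depth maps pixel by pixel using confidence scores.

    All four arguments are nested lists (rows of floats) of equal shape.
    depth_sv / depth_mv: depth predicted by hypothetical single-view and
    multi-view branches. conf_sv / conf_mv: unnormalized confidence scores.
    The weights are a two-way softmax over the scores, so they are positive
    and sum to 1 at every pixel (an assumed fusion rule, for illustration).
    """
    fused = []
    for d_sv_row, d_mv_row, c_sv_row, c_mv_row in zip(
        depth_sv, depth_mv, conf_sv, conf_mv
    ):
        row = []
        for d_sv, d_mv, c_sv, c_mv in zip(d_sv_row, d_mv_row, c_sv_row, c_mv_row):
            # Subtract the larger score before exponentiating for stability.
            m = max(c_sv, c_mv)
            w_sv = math.exp(c_sv - m)
            w_mv = math.exp(c_mv - m)
            row.append((w_sv * d_sv + w_mv * d_mv) / (w_sv + w_mv))
        fused.append(row)
    return fused
```

With equal confidence scores the fused depth is the plain average of the two predictions; when one score is much larger than the other, the fused depth approaches the prediction of the more confident branch.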