Yue Wang, Vitor Guizilini, Tianyuan Zhang, Yilun Wang, Hang Zhao...
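TL;DR: Proposes a multi-camera 3D object detection framework that uses a contextual-attention-based network to predict bounding boxes directly in 3D space, achieving state-of-the-art performance.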
Abstract
We introduce a framework for multi-camera 3D object detection. In contrast to existing works, which estimate 3D bounding boxes directly from monocular images or use depth prediction networks to generate input for