3d dense captioning is a recently-proposed novel task, where point clouds contain more geometric information than the 2D counterpart. However, it is also more challenging due to the higher complexity and wider variety of inter-object relations. Existing methods only treat such relation