We propose DiffusionDet, a new framework that formulates object detection as a denoising diffusion process from noisy boxes to object boxes. During the training stage, object boxes diffuse from ground-truth boxes to a random distribution, and the model learns to reverse this noising process. At inference, the model progressively refines a set of randomly generated boxes into the output results. Extensive evaluations on standard benchmarks, including MS-COCO and LVIS, show that DiffusionDet achieves favorable performance compared to previous well-established detectors. Our work yields two important findings for object detection. First, random boxes, although drastically different from pre-defined anchors or learned queries, are also effective object candidates. Second, object detection, one of the representative perception tasks, can be solved in a generative way. Our code is available at https://github.com/ShoufaChen/DiffusionDet.
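
The two stages described above can be illustrated with a minimal sketch. It is not the released implementation: it assumes boxes are already normalised 4-vectors, uses the standard closed-form Gaussian forward process with a cosine noise schedule, and refines boxes with a deterministic DDIM-style update. The names `alpha_bar`, `noise_boxes`, `refine_boxes`, and `predict_x0` are hypothetical, and `predict_x0` stands in for the trained detector, which is not reproduced here.

```python
import math
import random


def alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal level at step t under a cosine schedule (an assumption)."""
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def noise_boxes(gt_boxes, t, num_steps, rng):
    """Training side: diffuse ground-truth boxes toward Gaussian noise.

    Each coordinate becomes sqrt(ab) * x0 + sqrt(1 - ab) * eps, with eps ~ N(0, 1).
    """
    ab = alpha_bar(t, num_steps)
    return [
        [math.sqrt(ab) * c + math.sqrt(1.0 - ab) * rng.gauss(0.0, 1.0) for c in box]
        for box in gt_boxes
    ]


def refine_boxes(predict_x0, num_boxes, num_steps, sample_steps, rng):
    """Inference side: start from random boxes and refine them step by step.

    predict_x0(boxes, t) is a hypothetical stand-in for the trained model and
    must return one predicted clean box per input box. sample_steps should not
    exceed num_steps, so that consecutive time steps are distinct.
    """
    boxes = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(num_boxes)]
    # Evenly spaced time steps from num_steps down to 0.
    times = [round(num_steps * (1 - i / sample_steps)) for i in range(sample_steps + 1)]
    for t, t_prev in zip(times[:-1], times[1:]):
        ab_t = alpha_bar(t, num_steps)
        ab_prev = alpha_bar(t_prev, num_steps)
        x0 = predict_x0(boxes, t)
        refined = []
        for box, box0 in zip(boxes, x0):
            row = []
            for c, c0 in zip(box, box0):
                # Noise implied by the current box and the predicted clean box.
                eps = (c - math.sqrt(ab_t) * c0) / math.sqrt(1.0 - ab_t)
                # Deterministic move to the less noisy step t_prev.
                row.append(math.sqrt(ab_prev) * c0 + math.sqrt(1.0 - ab_prev) * eps)
            refined.append(row)
        boxes = refined
    return boxes
```

With a toy predictor that always returns fixed target boxes, `refine_boxes` converges to those targets at the final step, which shows the control flow of progressive refinement without saying anything about the accuracy of a real detector.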

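DiffusionDet is a new framework that treats object detection as a denoising diffusion process from noisy boxes to object boxes. During training, object boxes diffuse from ground-truth boxes to a random distribution, and the model learns to reverse this process. During inference, the model progressively refines a set of randomly generated boxes into the output results. Using random boxes as object candidates helps solve the object detection problem, and the problem can be solved with a generative approach.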

DiffusionDet: Diffusion Model for Object Detection