Scaling Properties of Diffusion Models for Perceptual Tasks

Nov, 2024

Scaling Properties of Diffusion Models for Perceptual Tasks

Rahul Ravishankar, Zeeshan Patel, Jathushan Rajasegaran, Jitendra Malik

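TL;DR: This paper examines iterative computation with diffusion models for both generation and visual perception, addressing the limited use of diffusion models in perception tasks such as depth estimation, optical flow, and amodal segmentation. By analyzing the scaling properties of diffusion models, it proposes compute-optimal training and inference methods that achieve performance competitive with state-of-the-art methods while using significantly less data and compute.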

Abstract

In this paper, we argue that iterative computation with Diffusion Models offers a powerful paradigm for not only generation but also visual perception tasks. We unify tasks such as depth estimation, optical flow, and amodal segmentation under the framework of →