BriefGPT.xyz
Dec, 2023
利用扩散模型和元提示进行视觉感知
Harnessing Diffusion Models for Visual Perception with Meta Prompts
HTML
PDF
Qiang Wan, Zilong Huang, Bingyi Kang, Jiashi Feng, Li Zhang
TL;DR
通过引入可学习的嵌入(元提示)来利用扩散模型解决视觉感知任务,我们的方法在深度估计和语义分割任务上取得了新的性能记录,并在ADE20K的语义分割和COCO数据集的姿态估计等方面达到了与最先进方法相媲美的结果,展示了其稳健性和多功能性。
Abstract
The issue of
generative pretraining
for vision models has persisted as a long-standing conundrum. At present, the text-to-image (T2I)
diffusion model
demonstrates remarkable proficiency in generating high-definit
→