BriefGPT.xyz
Mar, 2021
Vision Transformers for Dense Prediction
René Ranftl, Alexey Bochkovskiy, Vladlen Koltun
TL;DR
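This paper proposes dense vision transformers as a backbone for dense prediction tasks. Compared with fully convolutional networks, the architecture processes representations at a constant and relatively high resolution and has a global receptive field at every stage. Our experiments on monocular depth estimation and semantic segmentation show that the architecture significantly improves performance when a large amount of training data is available, making it a promising new neural network architecture.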
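The contrast between a constant-resolution transformer backbone and a downsampling convolutional backbone can be illustrated with a small sketch. It only computes the spatial grid each stage would operate on; it is not the authors' implementation. The input size (384), patch size (16), number of stages (4), and the convolutional stride schedule (4, 8, 16, 32) are assumptions chosen for illustration and do not come from the summary above.

```python
def vit_token_grids(height, width, patch_size=16, num_stages=4):
    """Token grid seen by each transformer stage.

    Assumption: the image is split once into non-overlapping patches,
    and every later stage keeps the same number of tokens.
    """
    grid = (height // patch_size, width // patch_size)
    return [grid for _ in range(num_stages)]


def conv_feature_grids(height, width, num_stages=4, first_stride=4):
    """Feature-map grid after each stage of a hypothetical convolutional
    backbone whose total stride doubles at every stage."""
    grids = []
    stride = first_stride
    for _ in range(num_stages):
        grids.append((height // stride, width // stride))
        stride *= 2
    return grids


vit = vit_token_grids(384, 384)
conv = conv_feature_grids(384, 384)

for stage, (v, c) in enumerate(zip(vit, conv), start=1):
    print(f"stage {stage}: transformer grid {v}, conv grid {c}")
```

Under these assumptions the transformer grid is the same at every stage, while the convolutional grid shrinks stage by stage and ends at a coarser resolution. The global receptive field comes from self-attention, in which every token can attend to every other token at each stage; the sketch does not model that part.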
Abstract
We introduce dense vision transformers, an architecture that leverages vision transformers in place of convolutional networks as a backbone for dense prediction tasks. We assemble tokens from various stages of th…