BriefGPT.xyz
Jun, 2024
Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
Dongyoon Hwang, Byungkun Lee, Hojoon Lee, Hyunseung Kim, Jaegul Choo
TL;DR
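Convolution Injector (CoIn) injects convolutions, which are rich in locality and equivariance biases, into pretrained Vision Transformers (ViTs), improving their adaptability and performance on visuo-motor control.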
Abstract
Vision transformers (ViT), when paired with large-scale pretraining, have shown remarkable performance across various computer vision tasks, primarily due to their weak inductive bias. However, while such