BriefGPT.xyz
June 2023
Optimizing ViViT Training: Time and Memory Reduction for Action Recognition
Shreyank N Gowda, Anurag Arnab, Jonathan Huang
TL;DR
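This paper proposes a training strategy that reduces the training time and memory consumption of video transformers. It modifies the encoder variant of ViViT so that the spatial transformer can be frozen while accuracy improves. Across 6 benchmarks, this cuts training cost and memory consumption by 50% while maintaining or slightly improving model performance.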
Abstract
In this paper, we address the challenges posed by the substantial training time and memory consumption associated with video transformers, focusing on the ViViT (Video Vision Transformer) model, in particular the …