July 2023
Sparse Double Descent in Vision Transformers: real or phantom threat?
Victor Quétu, Marta Milovanovic, Enzo Tartaglione
TL;DR: Vision transformers are state-of-the-art models that use attention to identify key features in images, but whether they exhibit sparse double descent, and what model size is optimal, remains unknown.