July 2023
Sparse Double Descent in Vision Transformers: real or phantom threat?
TL;DR: Vision transformers are state-of-the-art models that use attention to identify key features in images, but it remains unknown whether they suffer from sparse double descent and how to find their optimal model size.