BriefGPT.xyz
Feb, 2021
Dissecting Supervised Contrastive Learning
Florian Graf, Christoph D. Hofer, Marc Niethammer, Roland Kwitt
TL;DR
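This work asks whether the class-wise geometry sought in the encoder's output space at minimal loss differs fundamentally between the two loss functions (cross-entropy and supervised contrastive). It also provides empirical evidence that the optimization behavior of the two losses differs markedly, which affects how neural networks train.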
Abstract
Minimizing cross-entropy over the softmax scores of a linear map composed with a high-capacity encoder is arguably the most popular choice for training neural networks on supervised learning tasks. However, recen
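As a rough illustration of the setup named in the abstract's first sentence, the sketch below computes cross-entropy over the softmax scores of a linear map applied to an encoder output. Everything in it is an assumption made for illustration: the one-layer tanh `encoder`, the weight matrices `V` and `W`, and the function names are hypothetical and are not taken from the paper.

```python
import math

def encoder(x, V):
    # Hypothetical stand-in for a high-capacity encoder: a single tanh layer.
    return [math.tanh(sum(v * xi for v, xi in zip(row, x))) for row in V]

def linear_map(z, W):
    # Linear map from the encoder output to one logit per class.
    return [sum(w * zi for w, zi in zip(row, z)) for row in W]

def softmax(logits):
    # Subtract the maximum logit for numerical stability.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(x, label, V, W):
    # Negative log-probability that the softmax assigns to the true class.
    probs = softmax(linear_map(encoder(x, V), W))
    return -math.log(probs[label])

# Toy usage: 2-dimensional input, 2-dimensional encoding, 3 classes.
V = [[0.5, -0.2], [0.1, 0.3]]
W = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
loss = cross_entropy([1.0, 2.0], 0, V, W)
```

In this formulation the loss depends on both the encoder and the linear map `W`; with `W` set to all zeros the softmax is uniform and the loss equals the logarithm of the number of classes.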