BriefGPT.xyz
Jul, 2022
Title: Is a Caption Worth a Thousand Images? A Controlled Study for Representation Learning
Shibani Santurkar, Yann Dubois, Rohan Taori, Percy Liang, Tatsunori Hashimoto
TL;DR
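By comparing transfer performance with image and language data, the study shows that when the pre-training dataset is sufficiently large and contains descriptive captions with low variability, image-only methods cannot match CLIP's transfer performance on classification tasks.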
Abstract
The development of CLIP [Radford et al., 2021] has sparked a debate on whether language supervision can result in vision models with more
→