BriefGPT.xyz
May, 2021
Divide and Contrast: Self-supervised Learning from Uncurated Data
Yonglong Tian, Olivier J. Henaff, Aaron van den Oord
TL;DR
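This paper studies self-supervised learning on large-scale datasets and proposes Divide and Contrast (DnC), a hard negative mining method that combines contrastive learning with clustering. Pre-training with DnC on less-curated datasets significantly improves the downstream task performance of self-supervised learning, while remaining competitive with the current state of the art on highly curated datasets.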
Abstract
Self-supervised learning holds promise in leveraging large amounts of unlabeled data; however, much of its progress has thus far been limited to highly curated pre-training data such as ImageNet. We explore the effects of ...