BriefGPT.xyz
Jan, 2022
Text and Code Embeddings by Contrastive Pre-Training
Arvind Neelakantan, Tao Xu, Raul Puri, Alec Radford, Jesse Michael Han...
TL;DR
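This paper studies large-scale unsupervised (self-supervised) training of text embeddings with a contrastive objective. The resulting embeddings perform well on text and code search, with relative improvements ranging from 4% to 23.4% over earlier results obtained with supervised methods.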
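The contrastive idea summarized above can be sketched in a few lines. This is only an illustrative sketch, not the paper's implementation: it assumes a symmetric cross-entropy loss over in-batch negatives, cosine similarity as the score, and a temperature of 0.1, none of which are stated in this summary. The function names (`in_batch_contrastive_loss`, `rank_documents`) are hypothetical, and the vectors stand in for the outputs of some embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def _cross_entropy(logits, target):
    """Negative log-softmax of logits[target], computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[target]


def in_batch_contrastive_loss(queries, documents, temperature=0.1):
    """Assumed objective: queries[i] and documents[i] are a positive pair,
    and every other item in the batch acts as a negative."""
    n = len(queries)
    # Scaled similarity matrix: rows are queries, columns are documents.
    sims = [[cosine_similarity(q, d) / temperature for d in documents]
            for q in queries]
    # Query-to-document direction (rows) and document-to-query (columns).
    q_to_d = sum(_cross_entropy(sims[i], i) for i in range(n)) / n
    d_to_q = sum(_cross_entropy([sims[j][i] for j in range(n)], i)
                 for i in range(n)) / n
    return (q_to_d + d_to_q) / 2


def rank_documents(query, documents):
    """Semantic search: document indices sorted by similarity, best first."""
    scores = [cosine_similarity(query, d) for d in documents]
    return sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
```

With toy two-dimensional vectors, a batch whose pairs point in the same direction yields a much lower loss than a batch whose pairs are swapped, which is the signal that pulls matching texts together during training. At search time only `rank_documents` is needed: embed the query, embed the candidates, and sort by similarity.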
Abstract
Text embeddings are useful features in many applications such as semantic search and computing text similarity. Previous work typically trains models customized for different use cases, varying in dataset choice,