Sep, 2020
An Unsupervised Sentence Embedding Method by Mutual Information Maximization
Yan Zhang, Ruidan He, Zuozhu Liu, Kwan Hui Lim, Lidong Bing
TL;DR
This work proposes a lightweight extension on top of BERT and a self-supervised learning objective based on a mutual-information-maximization strategy to derive meaningful sentence embeddings in an unsupervised manner. Experimental results show that the method significantly outperforms other unsupervised sentence-embedding baselines on common STS tasks and downstream supervised tasks, and surpasses SBERT in settings where no in-domain labeled data is available.
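To make the mutual-information-maximization idea concrete, here is a minimal sketch (not the authors' code) of a Jensen-Shannon-style MI objective between a global sentence embedding and the local token features it was pooled from. In practice the local features would come from BERT; here they are random tensors so the snippet is self-contained, and the function name, shapes, and mean pooling are illustrative assumptions.

```python
# Minimal sketch of a Jensen-Shannon MI-maximization objective between
# global sentence embeddings and local (token-level) features.
import torch
import torch.nn.functional as F

def jsd_mi_loss(global_emb: torch.Tensor, local_feats: torch.Tensor) -> torch.Tensor:
    """global_emb: (batch, dim); local_feats: (batch, seq_len, dim)."""
    batch, seq_len, _ = local_feats.shape
    # Positive pairs: each sentence's global embedding vs. its own token features.
    pos_scores = torch.einsum("bd,bld->bl", global_emb, local_feats)
    # Negative pairs: a sentence's global embedding vs. tokens of other sentences.
    neg_scores = torch.einsum("bd,cld->bcl", global_emb, local_feats)
    mask = 1.0 - torch.eye(batch, device=global_emb.device)  # drop the b == c diagonal
    # JSD lower bound: -softplus(-score) for positives, softplus(score) for negatives.
    e_pos = -F.softplus(-pos_scores).mean()
    e_neg = (F.softplus(neg_scores) * mask.unsqueeze(-1)).sum() / (mask.sum() * seq_len)
    return e_neg - e_pos  # minimizing this maximizes the JSD MI estimate

# Toy usage with random features standing in for BERT token representations.
local_feats = torch.randn(8, 32, 768, requires_grad=True)  # (batch, seq_len, hidden)
global_emb = local_feats.mean(dim=1)                        # simple mean pooling
loss = jsd_mi_loss(global_emb, local_feats)
loss.backward()
```

Pulling each sentence's global embedding toward its own token features and away from other sentences' tokens is what lets the objective be trained without any labeled sentence pairs.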
Abstract
BERT is inefficient for sentence-pair tasks such as clustering or semantic search, as it needs to evaluate combinatorially many sentence pairs, which is very time-consuming. Sentence BERT (SBERT) …
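The efficiency argument can be seen in a rough sketch: with a pairwise (cross-encoder) setup, every query-candidate pair needs its own forward pass, whereas with precomputed fixed-size sentence embeddings the comparison reduces to a single matrix multiply. The `encode` function below is a stand-in that returns random normalized vectors, not a real BERT model.

```python
# Sketch of semantic search with precomputed sentence embeddings.
import numpy as np

def encode(sentences: list[str], dim: int = 768) -> np.ndarray:
    """Stand-in encoder: one fixed-size, L2-normalized vector per sentence."""
    rng = np.random.default_rng(0)
    vecs = rng.normal(size=(len(sentences), dim))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

corpus = [f"sentence {i}" for i in range(10_000)]
corpus_emb = encode(corpus)           # computed once: O(n) encoder calls

query_emb = encode(["an example query"])[0]
scores = corpus_emb @ query_emb       # cosine similarity against all 10k sentences at once
top5 = np.argsort(-scores)[:5]
# A pairwise cross-encoder would instead need 10,000 forward passes for this one
# query, and on the order of n^2 passes to compare all sentences with each other.
```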