Jan, 2024
Contrastive Learning in Distilled Models
Valerie Lim, Kai Wen Ng, Kenneth Lim
TL;DR
Applying the contrastive learning approach from the SimCSE paper, the architecture of DistilBERT, a knowledge-distilled model, is adapted to address the problem that NLP models perform poorly on semantic textual similarity (STS) and are too large to deploy as lightweight edge applications. The resulting lightweight model, DistilFace, achieves a Spearman's correlation of 72.1 on STS tasks, a 34.2% improvement over BERT base.
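For intuition, below is a minimal sketch of the unsupervised SimCSE-style contrastive objective applied to a DistilBERT encoder: the same batch is passed through the encoder twice with dropout active, the two noisy embeddings of each sentence form a positive pair, and other sentences in the batch serve as negatives. The pooling choice (mean pooling), temperature (0.05), and checkpoint name are assumptions for illustration, not necessarily the exact DistilFace settings.

```python
# Sketch of unsupervised SimCSE-style contrastive training on DistilBERT.
# Assumptions: mean pooling, temperature 0.05, "distilbert-base-uncased" checkpoint.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")
model.train()  # keep dropout active: two forward passes give two "views" of each sentence

def embed(sentences):
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state            # (batch, tokens, hidden)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (batch, tokens, 1)
    return (hidden * mask).sum(1) / mask.sum(1)           # mean pooling over tokens

sentences = ["A man is playing a guitar.", "The weather is nice today."]

# Two passes through the same encoder; dropout noise makes them a positive pair.
z1, z2 = embed(sentences), embed(sentences)

# InfoNCE loss: each sentence's second view is its positive,
# every other sentence in the batch is a negative.
temperature = 0.05
sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature
labels = torch.arange(sim.size(0))
loss = F.cross_entropy(sim, labels)
loss.backward()
```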
Abstract
Natural language processing models like BERT can provide state-of-the-art word embeddings for downstream NLP tasks. However, these models have yet to perform well on semantic textual similarity (STS), and may be too large to be deployed as lightweight edge applications.