BriefGPT.xyz
Aug, 2019
Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding
Oren Barkan, Noam Razin, Itzik Malkiel, Ori Katz, Avi Caciularu...
TL;DR
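This work introduces Distilled Sentence Embedding (DSE), a model based on knowledge distillation. It trains a sentence-embedding student model to reconstruct the scores of a cross-attention model, which speeds up similarity computation for query-candidate sentence pairs while achieving state-of-the-art performance on sentence representation benchmarks.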
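The idea in the summary above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `embed` encoder, the dot-product `student_score`, the mean-squared-error `distillation_loss`, and the placeholder teacher scores. It only shows why a student built on independent sentence embeddings is cheap at query time: candidate embeddings are computed once, and each query needs one encoding plus a lightweight score per candidate.

```python
import math


def embed(sentence, dim=8):
    """Hypothetical stand-in encoder: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in sentence.lower().split():
        # Deterministic toy hash so results are reproducible across runs.
        vec[sum(ord(c) for c in token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def student_score(query_vec, cand_vec):
    """Assumed student similarity: dot product of two independent embeddings."""
    return sum(q * c for q, c in zip(query_vec, cand_vec))


def distillation_loss(student_scores, teacher_scores):
    """Assumed objective: mean squared error against the teacher's pair scores."""
    n = len(student_scores)
    return sum((s - t) ** 2 for s, t in zip(student_scores, teacher_scores)) / n


candidates = ["the cat sat on the mat", "stocks fell sharply today"]
candidate_index = [embed(c) for c in candidates]  # computed once, offline

query_vec = embed("a cat sat on a mat")  # one encoding per incoming query
scores = [student_score(query_vec, c) for c in candidate_index]

# Placeholder values standing in for cross-attention teacher outputs.
teacher_scores = [0.9, 0.1]
loss = distillation_loss(scores, teacher_scores)
```

A cross-attention teacher would instead have to process every query-candidate pair jointly, so its cost grows with the number of pairs rather than the number of sentences.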
Abstract
Attention based models have become the new state-of-the-art in natural language understanding tasks such as question-answering and sentence similarity. Recent models, such as BERT and