BriefGPT.xyz
May, 2023
Extracting Text Representations for Terms and Phrases in Technical Domains
Francesco Fusco, Diego Antognini
TL;DR
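This paper proposes a fully unsupervised text encoding method that trains small character-based models to reconstruct a pre-trained embedding matrix. In technical domains the method matches the quality of sentence encoders while being 5 times smaller and up to 10 times faster, which addresses the vocabulary growth that comes with large-scale technical domains.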
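The idea of reconstructing a pre-trained embedding matrix from characters can be illustrated with a minimal sketch. This is not the authors' architecture: it swaps in a fastText-style average of hashed character trigram vectors, fitted by plain SGD on a squared reconstruction error. The vocabulary, the vector values, and the names `PRETRAINED`, `BUCKETS`, `encode`, and `train` are all assumptions made up for illustration.

```python
import zlib

DIM = 4          # embedding dimension of the toy "pre-trained" matrix
BUCKETS = 512    # number of hashed character n-gram buckets (assumed)

# Toy stand-in for a pre-trained embedding matrix; the values are made up.
PRETRAINED = {
    "polymer":     [0.9, 0.1, 0.0, 0.2],
    "polymerase":  [0.8, 0.2, 0.1, 0.3],
    "transistor":  [0.0, 0.9, 0.7, 0.1],
    "transformer": [0.1, 0.8, 0.9, 0.0],
}


def char_ngrams(term, n=3):
    """Return the character n-grams of a term, with boundary markers."""
    padded = "<" + term + ">"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def bucket_ids(term):
    """Map each n-gram to a bucket with a stable hash (CRC32)."""
    return [zlib.crc32(g.encode("utf-8")) % BUCKETS for g in char_ngrams(term)]


def encode(table, term):
    """Represent a term as the mean of its n-gram bucket vectors."""
    ids = bucket_ids(term)
    return [sum(table[i][d] for i in ids) / len(ids) for d in range(DIM)]


def mse(table):
    """Mean squared reconstruction error over the pre-trained vocabulary."""
    total = 0.0
    for term, target in PRETRAINED.items():
        pred = encode(table, term)
        total += sum((p - t) ** 2 for p, t in zip(pred, target)) / DIM
    return total / len(PRETRAINED)


def train(epochs=200, lr=0.5):
    """Fit the bucket table by SGD so that encode() reconstructs PRETRAINED."""
    table = [[0.0] * DIM for _ in range(BUCKETS)]
    for _ in range(epochs):
        for term, target in PRETRAINED.items():
            ids = bucket_ids(term)
            pred = encode(table, term)
            for d in range(DIM):
                # Gradient of the squared error w.r.t. each contributing bucket.
                grad = 2.0 * (pred[d] - target[d]) / len(ids)
                for i in ids:
                    table[i][d] -= lr * grad
    return table


if __name__ == "__main__":
    table = train()
    print("reconstruction MSE:", mse(table))
    # An unseen term still gets a vector, built from the trigrams it shares
    # with the training vocabulary.
    print("polymers ->", encode(table, "polymers"))
```

Because a term is encoded from its character n-grams rather than looked up by identity, the model size depends on the number of buckets and not on the vocabulary, and unseen terms still receive a representation.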
Abstract
Extracting dense representations for terms and phrases is a task of great importance for knowledge discovery platforms targeting highly-technical fields.