Sep, 2020
Will it Unblend?
Yuval Pinter, Cassandra L. Jacobs, Jacob Eisenstein
TL;DR
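In this paper, we quantify the difficulty of interpreting the meaning of blends through experiments on a new dataset. The results show that BERT's processing of these blends does not fully access the meanings of their component parts, which leaves its contextual representations semantically impoverished. Context-aware embedding systems perform well at identifying the structure of blends and recovering their source words, but their results remain far from satisfactory.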
Abstract
Natural language processing systems often struggle with out-of-vocabulary (OOV) terms, which do not appear in training data. Blends, such as "innoventor", are one particularly challenging class of OOV, as they ar