BriefGPT.xyz
Jun, 2024
Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation
Markus Frohmann, Igor Sterner, Ivan Vulić, Benjamin Minixhofer, Markus Schedl
TL;DR
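A new model addresses sentence segmentation when punctuation is missing, performs efficiently across different domains, and offers a generally applicable way to segment the poorly formatted text found in real-world settings.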
Abstract
Segmenting text into sentences plays an early and crucial role in many NLP systems. This is commonly achieved by using rule-based or stati