BriefGPT.xyz
Nov, 2023
KBioXLM: A Knowledge-anchored Biomedical Multilingual Pretrained Language Model
Lei Geng, Xu Yan, Ziqiang Cao, Juntao Li, Wenjie Li...
TL;DR
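Starting from the multilingual pretrained model XLM-R, the authors apply a knowledge-anchored approach to turn it into the biomedical-domain model KBioXLM. They build a multilingual biomedical corpus through knowledge alignment at three granularities, which yields significant performance gains in cross-lingual zero-shot scenarios.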
Abstract
Most biomedical pretrained language models are monolingual and cannot handle the growing cross-lingual requirements. The scarcity of non-English domain corpora, not to mention parallel data, poses a significant hurdle in training