BriefGPT.xyz
Jan, 2024
Improving Domain Adaptation through Extended-Text Reading Comprehension
Ting Jiang, Shaohan Huang, Shengyue Luo, Zihan Zhang, Haizhen Huang...
TL;DR
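By continuing pre-training on a domain-specific corpus, processing reading comprehension data with regex-based patterns, and introducing an LLM and clustering to enhance reading comprehension, the proposed method improves performance on domain-specific tasks by more than 5%.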
Abstract
To enhance the domain-specific capabilities of large language models, continued pre-training on a domain-specific corpus is a prevalent method. Recent work demonstrates that adapting models using