BriefGPT.xyz
Oct, 2023
Testing the Limits of Unified Sequence to Sequence LLM Pretraining on Diverse Table Data Tasks
Soumajyoti Sarkar, Leonard Lausen
TL;DR
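We attempt to build a shared modeling approach for diverse table tasks by using encoder-decoder style large language models in the pretraining stage, and we observe that pretraining with self-supervised objectives can significantly improve model performance on these tasks.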
Abstract
Tables stored in databases and tables which are present in web pages and articles account for a large part of semi-structured data that is available on the internet. It then becomes pertinent to develop a modeling approach with ...