Tailoring Pre-trained Language Models via Monte-Carlo Methods: Do You Have the Right Scissors?

July 2020

Do You Have the Right Scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods

Ning Miao, Yuxuan Song, Hao Zhou, Lei Li

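TL;DR: This paper proposes MC-Tailor, a method for text generation that truncates probability mass in over-estimated regions and transfers it to under-estimated ones. This mitigates the over- and/or under-estimation problems that can arise when a pre-trained model is fine-tuned on a small dataset. Experiments on a variety of text generation datasets show that it significantly outperforms fine-tuning.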
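To make the idea of moving probability mass concrete, here is a minimal sketch using plain rejection sampling on a toy discrete distribution. It is an illustration under stated assumptions, not the paper's implementation: the names `tailor_by_rejection`, `p_model`, `p_data`, and `ratio` are hypothetical, and the ratio between the data and model distributions is assumed to be known exactly, whereas for a real language model it would have to be estimated.

```python
import random


def tailor_by_rejection(p_model, ratio, n_samples, seed=0):
    """Sample from p_model, rejecting more often where the model over-estimates.

    p_model: dict mapping outcome -> model probability (hypothetical toy model).
    ratio:   dict mapping outcome -> p_data / p_model (assumed known here).
    """
    rng = random.Random(seed)
    outcomes = list(p_model)
    weights = [p_model[x] for x in outcomes]
    # Upper bound on the ratio, so every acceptance probability is at most 1.
    bound = max(ratio[x] for x in outcomes)
    accepted = []
    while len(accepted) < n_samples:
        x = rng.choices(outcomes, weights=weights, k=1)[0]
        # Over-estimated outcomes have a small ratio and are rejected more often;
        # the surviving mass is effectively shifted to under-estimated outcomes.
        if rng.random() < ratio[x] / bound:
            accepted.append(x)
    return accepted


# Toy example: the model over-estimates "a" and under-estimates "b" and "c".
p_data = {"a": 0.5, "b": 0.3, "c": 0.2}
p_model = {"a": 0.7, "b": 0.2, "c": 0.1}
ratio = {x: p_data[x] / p_model[x] for x in p_model}

samples = tailor_by_rejection(p_model, ratio, n_samples=20000)
freq = {x: samples.count(x) / len(samples) for x in p_model}
print(freq)
```

With enough samples, the empirical frequencies of the accepted samples approach `p_data` rather than `p_model`. The cost is that some draws are discarded: the expected acceptance rate is one over the largest ratio, which is 0.5 for these toy numbers.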

Abstract

It has been a common approach to pre-train a language model on a large corpus and fine-tune it on task-specific data. In practice, we observe that fine-tuning a pre-trained model on a small dataset may lead to over- and/or under-estimation problems. In this paper, we propose MC-Tailor,