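TL;DR: This paper explores zero-label learning in natural language processing and introduces Unsupervised Data Generation, a new method built on pretrained language models that generates high-quality training data without human annotation. The method makes it possible to train task-specific models on unlabeled data. When combined with labeled data, it also serves as an efficient data augmentation technique and achieves new state-of-the-art results on the SuperGLUE benchmark.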
Abstract
This paper explores zero-label learning in natural language processing (NLP), whereby no human-annotated data is used anywhere during training and models are trained purely on synthetic data. At the core of our f