Large Language Models (LLMs) have demonstrated remarkable performance through supervised fine-tuning or in-context learning using gold labels. However, this paradigm is limited by the availability of gold labels, and in certain scenarios LLMs may need to perform tasks that are too complex for humans to provide such labels. To tackle this challenge, this study explores whether unlabeled data alone can elicit strong model capabilities. We propose a new paradigm termed zero-to-strong generalization: we iteratively prompt LLMs to annotate unlabeled data and retain high-quality labels by filtering. Surprisingly, we observe that this iterative process gradually unlocks LLMs' potential on downstream tasks. Our experiments on extensive classification and reasoning tasks confirm the effectiveness of the proposed framework. Our analysis indicates that this paradigm is effective for both in-context learning and fine-tuning, and for various model sizes.
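
The annotate-then-filter loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `zero_to_strong`, `toy_annotate`, `rounds`, and `keep` are hypothetical, the confidence-based top-k filter is an assumed filtering criterion, and `toy_annotate` is a deterministic stand-in for a real LLM call.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input text, pseudo-label)
Annotator = Callable[[List[Example], str], Tuple[str, float]]


def zero_to_strong(
    unlabeled: List[str],
    annotate: Annotator,
    rounds: int = 3,
    keep: int = 2,
) -> List[Example]:
    """Iteratively pseudo-label data, keeping only the most confident labels."""
    demos: List[Example] = []  # the first round starts with no gold labels
    for _ in range(rounds):
        scored = []
        for text in unlabeled:
            # The annotator sees the demonstrations kept from the previous round.
            label, confidence = annotate(demos, text)
            scored.append((confidence, text, label))
        # Assumed filtering step: retain the `keep` highest-confidence labels.
        scored.sort(key=lambda item: item[0], reverse=True)
        demos = [(text, label) for _, text, label in scored[:keep]]
    return demos


def toy_annotate(demos: List[Example], text: str) -> Tuple[str, float]:
    """Hypothetical stand-in for prompting an LLM with `demos` as context."""
    label = "positive" if "good" in text else "negative"
    confidence = min(1.0, 0.5 + 0.1 * len(demos) + 0.1 * text.count("good"))
    return label, confidence


data = ["good good movie", "bad plot", "good acting", "dull"]
final_demos = zero_to_strong(data, toy_annotate)
print(final_demos)
```

In a real setting, `annotate` would wrap a model call that uses `demos` as in-context demonstrations (or as fine-tuning data), and the filter would use whatever quality signal the method defines.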

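This study addresses the capability limits that large language models (LLMs) face when gold labels are unavailable. The proposed paradigm, zero-to-strong generalization, iteratively prompts LLMs to annotate unlabeled data and retains high-quality labels, which significantly improves model performance on downstream tasks. Experimental results show that the method is effective across model sizes and for both in-context learning and fine-tuning.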

Zero-to-Strong Generalization: Eliciting Strong Capabilities of Large Language Models Iteratively without Gold Labels