Strong student models can learn from weaker teachers: when trained on the predictions of a weaker model, a strong pretrained student can learn to correct the weak model's errors and generalize to examples where the teacher is not confident, even when these examples are excluded from training. This enables learning from cheap, incomplete, and possibly incorrect label information, such as coarse logical rules or the generations of a language model. We show that existing weak supervision theory fails to account for both of these effects, which we call pseudolabel correction and coverage expansion, respectively. We give a new bound based on expansion properties of the data distribution and student hypothesis class that directly accounts for pseudolabel correction and coverage expansion. Our bounds capture the intuition that weak-to-strong generalization occurs when the strong model is unable to fit the mistakes of the weak teacher without incurring additional error. We show that these expansion properties can be checked from finite data and give empirical evidence that they hold in practice.
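
The two effects can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not the setup analyzed here: the synthetic Gaussian data, the hypothetical teacher that abstains near the decision boundary and flips 20% of its labels at random elsewhere, and the nearest-centroid classifier standing in for the strong student. Random label flips are chosen only because this simple student class cannot fit them without incurring additional error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumption): two Gaussian features; the true label depends only on the first.
n = 4000
X = rng.normal(size=(n, 2))
y = (X[:, 0] > 0).astype(int)

# Hypothetical weak teacher: abstains near the boundary (no coverage there)
# and flips 20% of its labels at random on the points it does cover.
covered = np.abs(X[:, 0]) > 0.5
flip = rng.random(n) < 0.2
pseudo = np.where(flip, 1 - y, y)

# Student stand-in: nearest-centroid classifier trained only on covered
# points, using the teacher's pseudolabels (never the true labels).
Xc, pc = X[covered], pseudo[covered]
c0 = Xc[pc == 0].mean(axis=0)
c1 = Xc[pc == 1].mean(axis=0)
w = c1 - c0
b = -0.5 * (w @ (c0 + c1))


def student_predict(points):
    """Predict 1 where a point is closer to centroid c1 than to c0."""
    return (points @ w + b > 0).astype(int)


pred = student_predict(X)

# Pseudolabel correction: student vs. teacher accuracy on covered points.
teacher_acc_covered = (pseudo[covered] == y[covered]).mean()
student_acc_covered = (pred[covered] == y[covered]).mean()

# Coverage expansion: student accuracy where the teacher abstained.
student_acc_uncovered = (pred[~covered] == y[~covered]).mean()

print("teacher accuracy (covered):  ", teacher_acc_covered)
print("student accuracy (covered):  ", student_acc_covered)
print("student accuracy (uncovered):", student_acc_uncovered)
```

In this sketch, pseudolabel correction corresponds to the student scoring higher than the teacher on the covered points, and coverage expansion corresponds to the student remaining accurate on points that were excluded from training.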

Theoretical Analysis of Weak-to-Strong Generalization