Existing generalization theories of supervised learning typically take a holistic approach and provide bounds on the expected generalization error over the whole data distribution, which implicitly assumes that the model generalizes similarly across all classes. In practice, however, generalization performance varies significantly among classes, and existing generalization bounds cannot capture this variation. In this work, we tackle this problem by theoretically studying the class-generalization error, which quantifies the generalization performance of each individual class. We derive a novel information-theoretic bound for the class-generalization error using the KL divergence, and we further obtain several tighter bounds using the conditional mutual information (CMI), which are significantly easier to estimate in practice. We empirically validate our proposed bounds on different neural networks and show that they accurately capture the complex class-generalization error behavior. Moreover, we show that the theoretical tools developed in this paper can be applied to several problems beyond this context.
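
To make the quantity under study concrete, the following is a minimal sketch of how a per-class generalization gap could be measured empirically, as the difference between the test error and the training error restricted to one class. It is not the bound derived in this work. The function name `classwise_generalization_gap`, the choice of the 0-1 loss, and the toy labels and predictions are all assumptions made for illustration.

```python
import numpy as np


def classwise_generalization_gap(y_train, pred_train, y_test, pred_test):
    """Return {class: test_error - train_error} under the 0-1 loss.

    Hypothetical helper for illustration only: it estimates the gap
    empirically and does not compute any information-theoretic bound.
    """
    gaps = {}
    for c in np.unique(y_train):
        train_mask = y_train == c
        test_mask = y_test == c
        if not test_mask.any():
            # Skip classes with no test samples; the gap is undefined there.
            continue
        train_err = np.mean(pred_train[train_mask] != c)
        test_err = np.mean(pred_test[test_mask] != c)
        gaps[int(c)] = float(test_err - train_err)
    return gaps


# Toy data (made up): the model fits both classes perfectly on the
# training set but misclassifies half of the class-1 test samples.
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])
pred_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_test = np.array([0, 0, 0, 0, 1, 1, 1, 1])
pred_test = np.array([0, 0, 0, 0, 1, 0, 0, 1])

gaps = classwise_generalization_gap(y_train, pred_train, y_test, pred_test)

# The aggregate gap averages over classes and hides the disparity.
overall_gap = float(np.mean(pred_test != y_test) - np.mean(pred_train != y_train))

print(gaps)
print(overall_gap)
```

In this toy setting the aggregate gap is 0.25, while the per-class gaps are 0.0 for class 0 and 0.5 for class 1, which is the kind of variation that a bound over the whole data distribution cannot express.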

Class-wise Generalization Error: An Information-Theoretic Analysis