Information-theoretic analysis of generalization capability of learning algorithms

May, 2017

Aolin Xu, Maxim Raginsky

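TL;DR: This paper derives information-theoretic upper bounds on the generalization error of a learning algorithm, stated in terms of the mutual information between the algorithm's input and output. The bounds guide the search for a balance between fitting the data and generalizing. Building on them, the paper explores several ways to control this mutual information, including regularizing the ERM algorithm with relative entropy or with random noise. These results extend and improve on the recent work of Russo and Zou.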

Abstract

We derive upper bounds on the generalization error of a learning algorithm in terms of the mutual information between its input and output