In transfer learning, training and testing data sets are drawn from different data distributions. The transfer generalization gap is the difference between the population loss on the target data distribution and the training loss. The training data set generally includes data drawn from both source and target distributions. This work presents novel information-theoretic upper bounds on the average transfer generalization gap that capture (i) the domain shift between the target data distribution $P'_Z$ and the source distribution $P_Z$ through Nielsen's family of $\alpha$-Jensen-Shannon (JS) divergences $D_{JS}^{\alpha}(P'_Z || P_Z)$; and (ii) the sensitivity of the transfer learner output $W$ to each individual sample of the data set $Z_i$ via the mutual information $I(W;Z_i)$. The $\alpha$-JS divergence is bounded even when the support of $P_Z$ is not included in that of $P'_Z$. This contrasts with the Kullback-Leibler (KL) divergence $D_{KL}(P_Z||P'_Z)$-based bounds of Wu et al. [1], which are vacuous under this assumption. Moreover, the obtained bounds hold for unbounded loss functions with bounded cumulant generating functions, unlike the $\phi$-divergence-based bound of Wu et al. We also obtain new upper bounds on the average transfer excess risk in terms of the $\alpha$-JS divergence for empirical weighted risk minimization (EWRM), which minimizes the weighted average of the training losses over the source and target data sets. Finally, we provide a numerical example to illustrate the merits of the introduced bounds.
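The boundedness claim can be checked on a small discrete example. The sketch below is illustrative only and is not taken from the paper: it assumes the skewed form $D_{JS}^{\alpha}(P'_Z || P_Z) = \alpha D_{KL}(P'_Z || M) + (1-\alpha) D_{KL}(P_Z || M)$ with mixture $M = \alpha P'_Z + (1-\alpha) P_Z$, and the function names and the two toy distributions are hypothetical. Under this assumed form the divergence is at most the binary entropy of $\alpha$ (in nats), whereas $D_{KL}(P_Z||P'_Z)$ is infinite as soon as $P_Z$ puts mass outside the support of $P'_Z$.

```python
import math


def kl(p, q):
    """D_KL(p || q) in nats for discrete distributions given as equal-length lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0.0:
            return math.inf  # p has mass where q has none
        total += pi * math.log(pi / qi)
    return total


def alpha_js(p_target, p_source, alpha):
    """Assumed skewed JS form: alpha*KL(P'||M) + (1-alpha)*KL(P||M),
    with M = alpha*P' + (1-alpha)*P and 0 < alpha < 1."""
    mix = [alpha * a + (1 - alpha) * b for a, b in zip(p_target, p_source)]
    return alpha * kl(p_target, mix) + (1 - alpha) * kl(p_source, mix)


def binary_entropy(alpha):
    """Binary entropy of alpha in nats, for 0 < alpha < 1."""
    return -alpha * math.log(alpha) - (1 - alpha) * math.log(1 - alpha)


# Hypothetical toy distributions: the source has mass on symbol 0,
# which lies outside the support of the target.
p_source = [0.5, 0.5, 0.0]
p_target = [0.0, 0.5, 0.5]
alpha = 0.5

kl_value = kl(p_source, p_target)               # infinite: support mismatch
js_value = alpha_js(p_target, p_source, alpha)  # finite for any 0 < alpha < 1

print("KL(P_Z || P'_Z)      =", kl_value)
print("alpha-JS(P'_Z || P_Z) =", js_value)
print("binary entropy bound  =", binary_entropy(alpha))
```

Because the mixture $M$ is positive wherever either distribution is positive, both KL terms inside the skewed divergence stay finite, which is why the support mismatch does not matter here.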

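This paper introduces information-theoretic upper bounds that measure the discrepancy between the source and target data distributions and account for the sensitivity of the model to each individual sample of the data set. It also proposes a new upper bound on the average transfer excess risk for weighted risk minimization.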

Information-Theoretic Bounds on Transfer Generalization Gap Based on Jensen-Shannon Divergence