We are interested in learning data-driven representations that can generalize well, even when trained on inherently biased data. In particular, we face the case where some attributes (bias) of the data, if learned by the model, can severely compromise its generalization properties. We tackle this problem through the lens of information theory, leveraging recent findings for a differentiable estimation of mutual information. We propose a novel end-to-end optimization strategy, which simultaneously estimates and minimizes the mutual information between the learned representation and the data attributes. When applied on standard benchmarks, our model shows comparable or superior classification performance with respect to state-of-the-art approaches. Moreover, our method is general enough to be applicable to the problem of ``algorithmic fairness'', with competitive results.

利用信息论的有关发现，我们提出了一种新的端到端优化策略，该策略同时估计和最小化学习表示和数据属性之间的互信息，通过这种策略，我们的模型在标准基准测试中表现出与最先进的方法相当或优越的分类性能，此方法可应用于问题的“算法公平性”，并得到了竞争性的结果。

通过互信息反向传播学习无偏表示