We consider the problem of domain generalization, in which a predictor is trained on data drawn from a family of related training domains and tested on a distinct, unseen test domain. While a variety of approaches have been proposed for this setting, it was recently shown that no existing algorithm consistently outperforms empirical risk minimization (ERM) over the training domains. Motivated by this gap, we propose a novel approach to the domain generalization problem called Model-Based Domain Generalization. In our approach, we first use unlabeled data from the training domains to learn multi-modal domain transformation models that map data from one training domain to any other domain. Next, we propose a constrained optimization-based formulation for domain generalization which enforces that a trained predictor be invariant to distributional shifts under the underlying domain transformation model. Finally, we propose a novel algorithmic framework for efficiently solving this constrained optimization problem. In our experiments, we show that this approach outperforms both ERM and existing domain generalization algorithms on numerous well-known, challenging datasets, including WILDS, PACS, and ImageNet. In particular, our algorithms beat the current state-of-the-art methods on the recently proposed WILDS benchmark by up to 20 percentage points.

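This paper proposes a model-based approach to domain generalization. By modeling the data-generating process together with an associated invariance condition, it recasts domain generalization as an infinite-dimensional constrained statistical learning problem. Using nonconvex duality theory, it then develops a relaxation of this constrained statistical problem and proposes a domain generalization algorithm with convergence guarantees, which achieves improvements of up to 30 percentage points on benchmarks including ColoredMNIST, Camelyon17-WILDS, FMoW-WILDS, and PACS.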
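
The constrained formulation and its dual relaxation can be sketched in symbols as follows. This is only an illustrative sketch under stated assumptions, not the paper's exact statement: the predictor class $\mathcal{F}$, loss $\ell$, domain transformation model $G(x, e)$ indexed by a domain code $e$, discrepancy $d$, and tolerance $\gamma \ge 0$ are all notation assumed here, and the invariance constraint is written in an averaged form for simplicity.

```latex
% Assumed notation: f in F is the predictor, G(x, e) maps x into domain e,
% d measures disagreement between predictions, gamma >= 0 is a tolerance.
\begin{aligned}
P^\star \;=\; \min_{f \in \mathcal{F}} \quad
  & \mathbb{E}_{(X,Y)}\big[\ell(f(X), Y)\big] \\
\text{s.t.} \quad
  & \mathbb{E}_{X,\,e}\big[d\big(f(X),\, f(G(X,e))\big)\big] \;\le\; \gamma
\end{aligned}

% Lagrangian dual with a single multiplier lambda >= 0.
D^\star \;=\; \max_{\lambda \ge 0}\; \min_{f \in \mathcal{F}}\;
  \mathbb{E}_{(X,Y)}\big[\ell(f(X), Y)\big]
  \;+\; \lambda \Big( \mathbb{E}_{X,\,e}\big[d\big(f(X),\, f(G(X,e))\big)\big] - \gamma \Big)

% One possible primal-dual iteration on empirical averages (step size eta > 0):
% a descent step on f for fixed lambda, followed by projected ascent on lambda.
\lambda \;\leftarrow\; \Big[\, \lambda + \eta \Big( \widehat{\mathbb{E}}\big[d\big(f(X),\, f(G(X,e))\big)\big] - \gamma \Big) \Big]_{+}
```

Weak duality gives $D^\star \le P^\star$ in general. Because the problem is nonconvex in $f$, the size of the gap between the two is exactly what a nonconvex duality analysis has to control.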

Model-Based Domain Generalization