In many applications of machine learning, the training and test set data come from different distributions, or domains. A number of domain generalization strategies have been introduced with the goal of achieving good performance on out-of-distribution data. In this paper, we propose a