Generalization of models to out-of-distribution (OOD) data has attracted
considerable attention recently. In particular, compositional generalization,
i.e., whether a model generalizes to new structures built from components
observed during training, has sparked substantial interest. In this