Leveraging the compositional nature of our world to expedite learning and facilitate generalization is a hallmark of human perception. In machine learning, on the other hand, achieving compositional generalization has proven to be an elusive goal, even for models with explicit compositional priors. To get a better handle on compositional generalization, we here approach it from the bottom up: Inspired by identifiable representation learning, we investigate compositionality as a property of the data-generating process rather than the data itself. This reformulation enables us to derive mild conditions on only the support of the training distribution and the model architecture, which are sufficient for compositional generalization. We further demonstrate how our theoretical framework applies to real-world scenarios and validate our findings empirically. Our results set the stage for a principled theoretical study of compositional generalization.

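Drawing on identifiable representation learning, this paper treats compositionality as a property of the data-generating process rather than of the data itself. It derives mild conditions, involving only the support of the training distribution and the model architecture, that are sufficient for compositional generalization. It also shows how the resulting theoretical framework applies to real-world scenarios and validates the findings empirically, setting the stage for a principled theoretical study of compositional generalization.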

Compositional Generalization from First Principles