The Abstraction and Reasoning Corpus (ARC) (Chollet, 2019) and its most recent language-complete instantiation (LARC) have been postulated as an important step towards general AI. Yet even state-of-the-art machine learning models struggle to achieve meaningful performance on these problems, falling behind non-learning-based approaches. We argue that solving these tasks requires extreme generalization that can only be achieved by properly accounting for core knowledge priors. As a step towards this goal, we focus on geometry priors and introduce LatFormer, a model that incorporates lattice symmetry priors in attention masks. We show that, for any transformation of the hypercubic lattice, there exists a binary attention mask that implements that group action. Hence, our study motivates a modification to the standard attention mechanism, where attention weights are scaled using soft masks generated by a convolutional network. Experiments on synthetic geometric reasoning show that LatFormer requires two orders of magnitude less data than standard attention and transformers. Moreover, our results on ARC and LARC tasks that incorporate geometric priors provide preliminary evidence that these complex datasets do not lie beyond the reach of deep learning models.
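To make the claim about binary attention masks concrete, the following is a minimal NumPy sketch, not the LatFormer implementation. It builds a 0/1 mask for a lattice transformation (here a cyclic translation and a 90-degree rotation of a 2D grid) and uses it to scale softmax attention weights. The helper names `permutation_mask` and `masked_attention`, the use of uniform attention scores, and the renormalization after masking are all assumptions made for illustration; in the model described above the masks are soft and are generated by a convolutional network.

```python
import numpy as np

def permutation_mask(h, w, source_of):
    """Binary mask M of shape (h*w, h*w).

    M[i, j] = 1 iff output cell i reads from input cell j, where
    source_of(r, c) returns the input coordinates for output cell (r, c).
    (Hypothetical helper, introduced only for this sketch.)
    """
    n = h * w
    mask = np.zeros((n, n), dtype=np.float64)
    for r in range(h):
        for c in range(w):
            sr, sc = source_of(r, c)
            mask[r * w + c, sr * w + sc] = 1.0
    return mask

def masked_attention(values, scores, mask):
    """Softmax attention whose weights are scaled elementwise by a mask.

    Renormalizing after masking is an assumption of this sketch.
    """
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    weights = weights * mask
    weights /= np.maximum(weights.sum(axis=-1, keepdims=True), 1e-12)
    return weights @ values

h = w = 3
grid = np.arange(h * w, dtype=np.float64).reshape(h, w)
values = grid.reshape(-1, 1)       # one scalar feature per lattice site
scores = np.zeros((h * w, h * w))  # uniform scores: the mask alone decides

# Cyclic translation by one column: output (r, c) reads input (r, c - 1).
shift_mask = permutation_mask(h, w, lambda r, c: (r, (c - 1) % w))
shifted = masked_attention(values, scores, shift_mask).reshape(h, w)

# Counter-clockwise 90-degree rotation of a square grid:
# output (r, c) reads input (c, h - 1 - r).
rot_mask = permutation_mask(h, w, lambda r, c: (c, h - 1 - r))
rotated = masked_attention(values, scores, rot_mask).reshape(h, w)
```

With a binary permutation mask, each output site attends to exactly one input site, so the attention layer reproduces the group action regardless of the (finite) scores. Relaxing the mask entries to values in [0, 1] gives the soft-mask variant that the abstract refers to.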

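This paper proposes LatFormer, a model that incorporates lattice symmetry priors into attention masks in order to achieve extreme generalization. Experiments on synthetic geometric reasoning show that LatFormer requires two orders of magnitude less data than standard attention and transformers. Results on ARC and LARC tasks that incorporate geometric priors provide preliminary evidence that these complex datasets are not beyond the reach of deep learning models.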

Injecting lattice symmetry priors into the attention mechanism to improve the sample efficiency of abstract geometric reasoning