Transformer language models are neural networks used for a wide variety of natural language tasks, including some that also require logical reasoning. However, a transformer model may easily learn spurious patterns in the data, short-circuiting actual reasoning. In this paper we investigate to what extent transformers can be trained to a) approximate reasoning in propositional logic while b) avoiding known reasoning shortcuts that arise from spurious correlations in the training data. To do so, we use a dataset with a known spurious correlation between the truth of a problem and, for example, the number of rules it contains. We augment the data with proofs and train two models: a generative transformer, WP-BART, trained on problems and their whole proofs, and a neuro-symbolic model, SIP-BART, trained on individual proof steps, which combines the generative transformer model BART with a symbolic proof checker. We find that SIP-BART succeeds in avoiding reasoning shortcuts, while WP-BART does not. For SIP-BART, we then identify a few remaining reasoning errors, not previously described in the literature, that arise from using a pre-trained language model. We analyse these qualitatively and organise them into a taxonomy of four types of additional pitfalls.
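To make the step-wise, checker-in-the-loop idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes, for simplicity, that a problem consists of atomic facts and rules of the form body -> head, and it replaces the trained generator with a hypothetical stand-in, `noisy_proposer`, that deliberately emits one invalid step. All names here (`check_step`, `prove`, `noisy_proposer`) are illustrative assumptions.

```python
from typing import Callable, FrozenSet, Iterable, List, Optional, Set, Tuple

# A rule "body -> head": if every atom in body is proven, head may be derived.
# Assumption for this sketch: problems use only rules of this form.
Rule = Tuple[FrozenSet[str], str]


def check_step(step: Rule, rules: Set[Rule], proven: Set[str]) -> bool:
    """Accept a proposed step only if it applies a given rule to proven atoms."""
    body, _head = step
    return step in rules and body <= proven


def prove(
    query: str,
    facts: Iterable[str],
    rules: Set[Rule],
    propose: Callable[[str, Set[str], Set[Rule]], List[Rule]],
    max_steps: int = 50,
) -> Optional[List[Rule]]:
    """Build a proof one checked step at a time; return None if none is found."""
    proven = set(facts)
    proof: List[Rule] = []
    for _ in range(max_steps):
        if query in proven:
            return proof
        # Take the first proposal that derives something new and passes the checker.
        accepted = next(
            (s for s in propose(query, proven, rules)
             if s[1] not in proven and check_step(s, rules, proven)),
            None,
        )
        if accepted is None:
            return None  # no valid step available: the query is not derivable
        proven.add(accepted[1])
        proof.append(accepted)
    return proof if query in proven else None


def noisy_proposer(query: str, proven: Set[str], rules: Set[Rule]) -> List[Rule]:
    """Hypothetical stand-in for the generator: one invalid guess, then all rules."""
    bogus = (frozenset(), query)  # claims the query follows from nothing
    return [bogus] + sorted(rules, key=lambda r: (sorted(r[0]), r[1]))


if __name__ == "__main__":
    rules = {
        (frozenset({"a"}), "b"),
        (frozenset({"b"}), "c"),
        (frozenset({"d"}), "e"),
    }
    proof = prove("c", {"a"}, rules, noisy_proposer)
    print([head for _, head in proof])  # ['b', 'c']
    print(prove("e", {"a"}, rules, noisy_proposer))  # None
```

In this sketch the checker, not the proposer, decides which steps enter the proof, so a shortcut guess such as "the query is true because the problem has many rules" cannot produce an accepted proof on its own.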

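Two models are trained on a logical reasoning task using a dataset with known misleading correlations: the proof-based generative Transformer model WP-BART and the neuro-symbolic model SIP-BART. SIP-BART is found to avoid reasoning shortcuts, whereas WP-BART does not. For SIP-BART, several types of reasoning error not previously described in the literature are also identified and analysed qualitatively, yielding a taxonomy of four types of pitfall.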

Reasoning in Transformers: Mitigating Spurious Correlations and Reasoning Shortcuts