Knowledge-enhanced models have developed a diverse set of techniques for integrating knowledge from different sources. However, most previous work neglects the language model's own ability and simply concatenates external knowledge to the input. Recent work suggests that the Feed Forward Network (FFN) in a pre-trained language model can be viewed as a memory that stores factual knowledge. In this work, we explore the FFN in the Transformer and propose a novel knowledge fusion model, namely Kformer, which incorporates external knowledge through the Transformer's feed-forward layer. We empirically find that simply injecting knowledge into the FFN can enhance the pre-trained language model's ability and facilitate current knowledge fusion methods. Our results on two benchmarks, one in commonsense reasoning (SocialIQA) and one in medical question answering (MedQA-USMLE), demonstrate that Kformer can exploit external knowledge more deeply and achieve absolute improvements on these tasks.

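We propose a simple model, Kformer, that injects information from pre-trained models (PTMs) and external knowledge into the Transformer's FFN layers, exploiting both the knowledge stored in PTMs and their many internal knowledge neurons. Experimental results show that on knowledge-intensive tasks such as commonsense reasoning and medical question answering, Kformer outperforms other knowledge injection techniques, such as concatenation or attention-based injection.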
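
To make the idea of injecting knowledge through the feed-forward layer concrete, the following is a minimal sketch rather than the model's actual implementation. It assumes the key-value-memory view of the FFN, in which the first projection acts as a set of keys and the second as a set of values, and it assumes that external knowledge has already been projected into extra key and value rows that are appended to that memory. The function names (`ffn`, `ffn_with_knowledge`), the ReLU activation, and the omission of bias terms are illustrative choices, not details taken from the text above.

```python
def relu(x):
    # Assumed activation for this sketch; a real PTM may use a different one.
    return x if x > 0.0 else 0.0


def ffn(x, keys, values, act=relu):
    """FFN viewed as a key-value memory: sum_i act(x . k_i) * v_i (no biases)."""
    # One activation coefficient per memory slot (per key row).
    coeffs = [act(sum(k_j * x_j for k_j, x_j in zip(key, x))) for key in keys]
    d_out = len(values[0])
    # Weighted sum of the value rows.
    return [sum(c * v[j] for c, v in zip(coeffs, values)) for j in range(d_out)]


def ffn_with_knowledge(x, keys, values, know_keys, know_values, act=relu):
    """Hypothetical injection step: append knowledge rows to the FFN memory.

    know_keys / know_values are assumed to be knowledge embeddings that were
    already projected into the FFN's key space and value space.
    """
    return ffn(x, keys + know_keys, values + know_values, act)


# Toy example: a 2-slot FFN plus one injected knowledge slot.
x = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
know_keys = [[2.0, 0.0]]
know_values = [[0.0, 3.0]]

plain = ffn(x, keys, values)
fused = ffn_with_knowledge(x, keys, values, know_keys, know_values)
print(plain, fused)
```

In this toy setup the injected slot fires because its key aligns with the input, so its value row is added to the layer output. The input sequence is left untouched, which is the contrast with concatenation-based injection.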

Kformer: Knowledge Injection in Transformer Feed-Forward Layers