Large language models (LLMs) have demonstrated remarkable proficiency in understanding and generating responses to complex queries through large-scale pre-training. However, the efficacy of these models in memorizing and reasoning over large-scale structured knowledge, especially world knowledge that explicitly covers abundant factual information, remains questionable. To address this gap, our research investigates whether LLMs can effectively store, recall, and reason with knowledge at a scale comparable to modern knowledge bases (KBs) such as Wikidata. Specifically, we focus on three aspects that are crucial to assessing this viability: (1) the efficiency with which LLMs of different sizes memorize the exact knowledge in a large-scale KB; (2) the flexibility with which they recall the memorized knowledge in response to natural language queries; and (3) their capability to infer new knowledge through reasoning. Our findings indicate that while LLMs hold promise as large-scale KBs capable of retrieving and responding with flexibility, their reasoning capabilities must be enhanced to fully realize this potential.

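Large language models have shown remarkable ability in understanding and responding to complex queries, but their effectiveness at storing, recalling, and reasoning over large-scale structured knowledge remains to be studied. This study finds that, although large language models show promise as large-scale knowledge bases that can retrieve and answer questions flexibly, their reasoning capabilities need to be strengthened to realize their full potential.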

Can language models act as knowledge bases at scale?