This study investigates whether Large Language Models (LLMs) can reconstruct and construct the physical world from textual knowledge alone, and examines how model performance affects spatial understanding. To improve comprehension of geometric and spatial relationships in the complex physical world, the study introduces a set of geometric conventions and develops a workflow built on multi-layer graphs and a multi-agent system framework. It examines how LLMs carry out multi-step, multi-objective geometric inference in a spatial environment using multi-layer graphs under unified geometric conventions. The study also employs a genetic algorithm, inspired by knowledge from large models, to solve geometric constraint problems. In summary, this work explores the feasibility of using text-based LLMs as physical-world builders and designs a workflow to enhance their capabilities.
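
To make the idea of solving a geometric constraint problem with a genetic algorithm concrete, the following is a minimal sketch and not the study's actual method. The abstract does not describe the encoding, fitness function, or operators used, so everything here is an assumption for illustration: the toy problem (three points with target pairwise distances), the names `POINTS`, `CONSTRAINTS`, `constraint_error`, and `solve`, and the choice of tournament selection, uniform crossover, Gaussian mutation, and elitism.

```python
import math
import random

# Hypothetical toy problem: three named points and target pairwise distances.
POINTS = ["A", "B", "C"]
CONSTRAINTS = [("A", "B", 3.0), ("B", "C", 4.0), ("A", "C", 5.0)]


def constraint_error(genome):
    """Sum of squared differences between actual and target distances.

    The genome is a flat list [xA, yA, xB, yB, xC, yC].
    """
    coords = {name: (genome[2 * i], genome[2 * i + 1])
              for i, name in enumerate(POINTS)}
    total = 0.0
    for p, q, target in CONSTRAINTS:
        (x1, y1), (x2, y2) = coords[p], coords[q]
        total += (math.hypot(x2 - x1, y2 - y1) - target) ** 2
    return total


def solve(pop_size=60, generations=200, sigma=0.3, seed=0):
    """Generic genetic algorithm; returns (best genome, best error per generation)."""
    rng = random.Random(seed)
    n = 2 * len(POINTS)
    population = [[rng.uniform(-5.0, 5.0) for _ in range(n)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=constraint_error)
        history.append(constraint_error(population[0]))
        # Elitism: carry the current best layout over unchanged.
        next_gen = [population[0][:]]
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random candidates, twice.
            p1 = min(rng.sample(population, 3), key=constraint_error)
            p2 = min(rng.sample(population, 3), key=constraint_error)
            # Uniform crossover: each coordinate comes from either parent.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Gaussian mutation applied to each coordinate with probability 0.2.
            child = [g + rng.gauss(0.0, sigma) if rng.random() < 0.2 else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    best = min(population, key=constraint_error)
    history.append(constraint_error(best))
    return best, history


if __name__ == "__main__":
    best, history = solve()
    print("best layout:", [round(v, 3) for v in best])
    print("remaining constraint error:", history[-1])
```

Because the best layout is always carried into the next generation, the recorded error never increases from one generation to the next. In a workflow like the one the abstract describes, the constraint list would presumably come from the LLM's geometric reasoning rather than being hard-coded as it is in this sketch.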

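This study addresses the capability gap in using large language models (LLMs) to reconstruct and construct the physical world from textual knowledge. It proposes a set of geometric conventions and develops a workflow based on multi-layer graphs and a multi-agent system framework to improve understanding of geometric and spatial relationships in the complex physical world. The key finding is that, under unified geometric conventions, LLMs can perform multi-step and multi-objective geometric reasoning through multi-layer graphs.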

Navigating the Complex Physical World via Geometrically Constrained Large Language Models