The capabilities of large language models (LLMs) have sparked debate over whether such systems just learn an enormous collection of superficial statistics or a coherent model of the data-generating process -- a world model. We find evidence for the latter by analyzing the learned representations of three spatial datasets (world, US, NYC places) and three temporal datasets (historical figures, artworks, news headlines) in the Llama-2 family of models. We discover that LLMs learn linear representations of space and time across multiple scales. These representations are robust to prompting variations and unified across different entity types (e.g., cities and landmarks). In addition, we identify individual ``space neurons'' and ``time neurons'' that reliably encode spatial and temporal coordinates. Our analysis demonstrates that modern LLMs acquire structured knowledge about fundamental dimensions such as space and time, supporting the view that they learn not merely superficial statistics, but literal world models.
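One common way to test whether a quantity is linearly represented is to fit a linear probe: a regularized linear regression from a layer's activations to the target coordinates, scored on held-out entities. The sketch below illustrates that idea only; it is not taken from the paper. It uses synthetic activations in place of real Llama-2 hidden states, and the names `fit_ridge_probe` and `r2_score`, the dimensions, the noise level, and the ridge strength `alpha` are all illustrative assumptions.

```python
import numpy as np


def fit_ridge_probe(X, Y, alpha=1.0):
    """Closed-form ridge regression with an intercept (hypothetical helper).

    X: (n, d) activations, Y: (n, k) targets such as latitude/longitude.
    Returns weights W of shape (d, k) and bias b of shape (k,).
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean  # center so the intercept is not penalized
    d = X.shape[1]
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(d), Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b


def r2_score(Y_true, Y_pred):
    """Coefficient of determination, computed separately for each target column."""
    ss_res = ((Y_true - Y_pred) ** 2).sum(axis=0)
    ss_tot = ((Y_true - Y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for real data (assumption): each entity has a
# (latitude, longitude) pair that is embedded linearly in a 64-dimensional
# "activation" vector, plus Gaussian noise.
rng = np.random.default_rng(0)
n_entities, d_model = 2000, 64
coords = np.column_stack([
    rng.uniform(-90, 90, n_entities),    # latitude
    rng.uniform(-180, 180, n_entities),  # longitude
])
true_directions = rng.normal(size=(2, d_model))
activations = coords @ true_directions + rng.normal(scale=5.0, size=(n_entities, d_model))

# Fit on one set of entities, evaluate on held-out entities.
train, test = slice(0, 1500), slice(1500, None)
W, b = fit_ridge_probe(activations[train], coords[train], alpha=1.0)
pred = activations[test] @ W + b
probe_r2 = r2_score(coords[test], pred)
print("held-out R^2 (lat, lon):", probe_r2)
```

With real model activations, `activations` would be replaced by hidden states extracted at a chosen layer for each entity name. A high held-out score from a purely linear map is the kind of evidence the phrase ``linear representations'' refers to.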

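Using the Llama-2 models, we analyze the learned representations of three spatial datasets (world, US, and NYC places) and three temporal datasets (historical figures, artworks, and news headlines) for evidence of what LLMs learn. We find that LLMs learn linear representations of space and time across multiple scales, and that these representations are robust to prompt variations and unified across different entity types (e.g., cities and landmarks). In addition, we identify individual ``space neurons'' and ``time neurons'' that reliably encode spatial and temporal coordinates. Our analysis demonstrates that modern LLMs acquire structured knowledge about fundamental dimensions such as space and time, supporting the view that they learn not merely superficial statistics but literal world models.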

Language Models Represent Space and Time