This study investigates how Large Language Models (LLMs) handle the signifier (form) and the signified (meaning) by distinguishing two LLM evaluation paradigms: psycholinguistic and neurolinguistic. Traditional psycholinguistic evaluations often reflect statistical biases that may misrepresent LLMs' true linguistic capabilities. We introduce a neurolinguistic approach based on a novel method that combines minimal pairs with diagnostic probing to analyze activation patterns across model layers. This method allows a detailed examination of how LLMs represent form and meaning, and of whether these representations are consistent across languages. Our contributions are three-fold: (1) we compare neurolinguistic and psycholinguistic methods, revealing distinct patterns in LLM assessment; (2) we demonstrate that LLMs exhibit higher competence in form than in meaning, with the latter largely correlated with the former; (3) we present new conceptual minimal pair datasets for Chinese (COMPS-ZH) and German (COMPS-DE), complementing existing English datasets.
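
As a rough illustration of what combining minimal pairs with diagnostic probing can look like, the sketch below trains one linear probe per layer to separate the acceptable sentence of each minimal pair from its unacceptable counterpart, then reports held-out accuracy layer by layer. It is a minimal sketch under stated assumptions, not the procedure used in this study: the activations are synthetic Gaussian vectors rather than hidden states from a real model, the probe is a simple difference-of-class-means classifier standing in for whatever classifier a real study would train, and all function names and the `signal` parameter are hypothetical.

```python
import random


def synthetic_activations(n_pairs, dim, signal, rng):
    """Stand-in for one layer's sentence activations over minimal pairs.

    Assumption: real activations would come from a model's hidden states.
    Label 1 marks the acceptable sentence of a pair, 0 the unacceptable one.
    `signal` shifts the two classes apart along the first dimension only.
    """
    vectors, labels = [], []
    for _ in range(n_pairs):
        for label in (1, 0):
            vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            vec[0] += signal if label == 1 else -signal
            vectors.append(vec)
            labels.append(label)
    return vectors, labels


def fit_mean_difference_probe(vectors, labels):
    """Fit a linear probe whose weight vector is the difference of class means."""
    dim = len(vectors[0])
    sums = {0: [0.0] * dim, 1: [0.0] * dim}
    counts = {0: 0, 1: 0}
    for vec, label in zip(vectors, labels):
        counts[label] += 1
        for i, x in enumerate(vec):
            sums[label][i] += x
    mean_pos = [s / counts[1] for s in sums[1]]
    mean_neg = [s / counts[0] for s in sums[0]]
    weights = [p - n for p, n in zip(mean_pos, mean_neg)]
    # Place the decision boundary halfway between the two class means.
    bias = -sum(w * (p + n) / 2.0 for w, p, n in zip(weights, mean_pos, mean_neg))
    return weights, bias


def probe_accuracy(weights, bias, vectors, labels):
    """Fraction of sentences whose predicted label matches the true label."""
    correct = 0
    for vec, label in zip(vectors, labels):
        score = sum(w * x for w, x in zip(weights, vec)) + bias
        correct += int((score > 0) == (label == 1))
    return correct / len(labels)


def layerwise_probe(signals, n_pairs=200, dim=8, seed=0):
    """Train and evaluate one probe per simulated layer.

    Each entry of `signals` plays the role of one layer; a larger value
    means the layer encodes the contrast more linearly.
    """
    rng = random.Random(seed)
    accuracies = []
    for signal in signals:
        train_x, train_y = synthetic_activations(n_pairs, dim, signal, rng)
        test_x, test_y = synthetic_activations(n_pairs, dim, signal, rng)
        weights, bias = fit_mean_difference_probe(train_x, train_y)
        accuracies.append(probe_accuracy(weights, bias, test_x, test_y))
    return accuracies


if __name__ == "__main__":
    for layer, acc in enumerate(layerwise_probe([0.0, 0.5, 2.0])):
        print(f"layer {layer}: probe accuracy {acc:.2f}")
```

In this toy setup, a layer with no class separation should yield near-chance probe accuracy, while a layer with strong separation should yield high accuracy. Comparing such layer-wise curves for form-oriented and meaning-oriented minimal pairs is the kind of analysis the abstract describes.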

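This study addresses the gap in the linguistic understanding of large language models (LLMs) with respect to the signifier (form) and the signified (meaning). By introducing a new neurolinguistic method that combines minimal pairs with diagnostic probing to analyze activation patterns across model layers, we find that LLMs are more competent in form than in meaning, and that their performance on meaning is largely correlated with form. Our work provides new datasets for evaluating LLMs in multilingual settings and deepens the understanding of their capabilities.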

Large Language Models as Neurolinguistic Subjects: Identifying Internal Representations for Form and Meaning