We explore the topology of representation manifolds arising in autoregressive
neural language models trained on raw text data. In order to study their
properties, we introduce tools from computational algebraic topology, which we
use as a basis for a measure of topological complexity that we call
perforation.
Using this measure, we study the evolution of topological structure in
GPT-based large language models across depth and time during training. We then
compare these to gated recurrent models, and show that the latter exhibit more
topological complexity, with a distinct pattern of changes common to all
natural languages but absent from synthetically generated data. The paper
presents a detailed analysis of the representation manifolds derived by these
models based on studying the shapes of vector clouds induced by them as they
are conditioned on sentences from corpora of natural language text.
The methods developed in this paper are novel in the field and based on
mathematical apparatus that might be unfamiliar to the target audience. To help
with that, we introduce the minimum necessary theory and provide additional
visualizations in the appendices.
The main contribution of the paper is a striking observation about the
topological structure of the transformer as compared to LSTM-based neural
architectures. It suggests that further research into mathematical properties
of these neural networks is necessary to understand the operation of large
transformer language models. We hope this work inspires further explorations in
this direction within the NLP community.

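This paper studies the topological properties of representation manifolds
arising in autoregressive neural language models trained on raw text data. It
introduces tools from computational algebraic topology and uses them as the
basis for a measure of topological complexity, called perforation, to study how
the topological structure of GPT evolves across depth and training time. A
comparison with gated recurrent models shows that the latter exhibit more
topological complexity, with a pattern of changes common to all natural
languages but absent from synthetically generated data. The paper analyzes in
detail the representation manifolds derived by these models by studying the
shapes of the vector clouds they induce when conditioned on sentences from
natural language corpora. Its main contribution is a striking observation about
the topological structure of the Transformer as compared to LSTM-based
architectures, which suggests that further research into the mathematical
properties of these networks is needed to understand how large Transformer
language models operate. The authors hope this work inspires further
exploration in this direction within the NLP community.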

Hidden Holes: topological aspects of language models

As artificial neural networks grow in complexity, understanding their inner
workings becomes increasingly challenging, a problem that is particularly
important in healthcare applications. Perplexity (PPL), an intrinsic evaluation
metric of autoregressive neural language models (NLMs), reflects how
"surprised" an NLM is by novel input. PPL has been widely used to understand the
behavior of NLMs. Previous findings show that changes in PPL when masking
attention layers in pre-trained transformer-based NLMs reflect linguistic
anomalies associated with Alzheimer's disease dementia. Building upon this, we
explore a novel bidirectional attention head ablation method that exhibits
properties attributed to the concepts of cognitive and brain reserve in human
brain studies, which postulate that people with more neurons in the brain and
more efficient processing are more resilient to neurodegeneration. Our results
show that larger GPT-2 models require a disproportionately larger share of
attention heads to be masked/ablated to display degradation of similar
magnitude to masking in smaller models. These results suggest that the
attention mechanism in transformer models may present an analogue to the
notions of cognitive and brain reserve and could potentially be used to model
certain aspects of the progression of neurodegenerative disorders and aging.
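
The perplexity metric referred to above can be illustrated with a minimal
sketch. It uses the standard definition of perplexity as the exponential of the
average negative log-likelihood per token. The function name `perplexity` and
the example probabilities are hypothetical and chosen only for illustration;
they are not taken from the paper.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    PPL = exp(-(1/N) * sum(log p_i)), where p_i is the probability the
    model assigned to the i-th token given the preceding tokens.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Hypothetical per-token probabilities for two four-token inputs.
expected = [math.log(p) for p in (0.5, 0.5, 0.5, 0.5)]
surprising = [math.log(p) for p in (0.1, 0.1, 0.1, 0.1)]

print(perplexity(expected))
print(perplexity(surprising))
```

With these made-up inputs the second sequence yields a perplexity of about 10
against about 2 for the first, which is the sense in which a higher PPL
indicates a more "surprised" model.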

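Perplexity (PPL), an intrinsic evaluation metric, is widely used to understand
the behavior of autoregressive neural language models (NLMs). This study
explores a novel bidirectional attention head ablation method that exhibits
properties attributed to the concepts of cognitive and brain reserve in human
brain studies. The results suggest that the attention mechanism in transformer
models may present an analogue to these notions and could potentially be used
to model certain aspects of the progression of neurodegenerative disorders and
aging.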

Too Big to Fail: Larger Language Models are Disproportionately Resilient to Induction of Dementia-Related Linguistic Anomalies

Next-word predictions from autoregressive neural language models show
remarkable sensitivity to syntax. This work evaluates the extent to which this
behavior arises as a result of a learned ability to maintain implicit
representations of incremental syntactic structures. We extend work in
syntactic probing to the incremental setting and present several probes for
extracting incomplete syntactic structure (operationalized through parse states
from a stack-based parser) from autoregressive language models. We find that
our probes can be used to predict model preferences on ambiguous sentence
prefixes and causally intervene on model representations and steer model
behavior. This suggests implicit incremental syntactic inferences underlie
next-word predictions in autoregressive neural language models.

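This study evaluates the extent to which autoregressive neural language models
learn to maintain implicit representations of incremental syntactic structure.
It presents several probes for extracting incomplete syntactic structure from
autoregressive language models and finds that these probes can be used to
predict model preferences on ambiguous sentence prefixes and to causally
intervene on model representations. This suggests that implicit incremental
syntactic inference underlies next-word prediction in autoregressive neural
language models.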