Current decoder-based pre-trained language models (PLMs) successfully
demonstrate multilingual capabilities. However, it is unclear how these models
handle multilingualism. We analyze the neuron-level internal behavior of
multilingual decoder-based PLMs, specifically examining whether they contain
neurons that fire ``uniquely for each language''. We analyze six languages:
English, German, French, Spanish,
Chinese, and Japanese, and show that language-specific neurons are unique, with
a slight overlap (< 5%) between languages. These neurons are mainly distributed
in the models' first and last few layers. This trend remains consistent across
languages and models. Additionally, we tamper with less than 1% of the total
neurons in each model during inference and demonstrate that tampering with a
few language-specific neurons drastically changes the probability of target
language occurrence in text generation.
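
The overlap figure above can be illustrated with a small set computation. This is a minimal sketch, not the authors' procedure: the abstract does not say how overlap is measured, so the ratio used here (shared neurons divided by the size of the smaller set) is an assumption, as are the function name `pairwise_overlap`, the `(layer, index)` neuron identifiers, and the toy data, which is invented for illustration and is not a result from the paper.

```python
from itertools import combinations


def pairwise_overlap(neurons_by_lang):
    """Return the overlap ratio for every pair of languages.

    neurons_by_lang maps a language code to a set of neuron identifiers.
    Assumed definition: |A & B| / min(|A|, |B|).
    """
    overlaps = {}
    for a, b in combinations(sorted(neurons_by_lang), 2):
        shared = neurons_by_lang[a] & neurons_by_lang[b]
        smaller = min(len(neurons_by_lang[a]), len(neurons_by_lang[b]))
        overlaps[(a, b)] = len(shared) / smaller if smaller else 0.0
    return overlaps


# Hypothetical neuron sets, written as (layer, index) pairs.
toy = {
    "en": {(0, 1), (0, 7), (23, 4), (23, 9)},
    "de": {(0, 2), (0, 7), (22, 5), (23, 8)},
    "ja": {(0, 3), (1, 6), (23, 1), (23, 2)},
}
overlaps = pairwise_overlap(toy)
for pair, ratio in overlaps.items():
    print(pair, ratio)
```

Dividing by the smaller set keeps the ratio in [0, 1] even when the per-language sets differ in size; other normalizations, such as the Jaccard index, would be equally defensible.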

On the Multilingual Ability of Decoder-based Pre-trained Language Models: Finding and Controlling Language-Specific Neurons

Large language models (LLMs) demonstrate remarkable performance across a
spectrum of languages. In this work, we delve into the question: How do LLMs
handle multilingualism? We introduce a framework that depicts LLMs' processing
of multilingual inputs: In the first several layers, LLMs understand the
question, converting multilingual inputs into English to facilitate the
task-solving phase. In the intermediate layers, LLMs engage in problem-solving
by thinking in English and incorporating multilingual knowledge to obtain
factual content, leveraging the self-attention and feed-forward structures,
respectively. In the last several layers, LLMs generate responses that align
with the original language of the query. In addition, we investigate the
existence of language-specific neurons when processing a certain language. To
detect neurons activated by the input language, even without labels, we
design a Parallel Language-specific Neuron Detection
($\texttt{PLND}$) method that effectively measures the significance of neurons
when handling multilingual inputs. Through comprehensive ablation analysis that
deactivates neurons in different layers and structures, we verify the
framework that we propose. Additionally, we demonstrate that we can utilize
such a framework to effectively enhance the multilingual ability with much less
training effort.
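
The deactivation step used in such an ablation can be sketched on a toy feed-forward block. This is an illustrative sketch only: the abstract does not describe how PLND selects neurons or how the model is instrumented, so the ReLU layer, the function name `feed_forward`, the random weights, and the chosen neuron indices are all assumptions.

```python
import numpy as np


def feed_forward(x, w_in, w_out, deactivate=()):
    """Toy two-layer feed-forward block with optional neuron deactivation.

    deactivate lists hidden-unit indices whose activations are forced to zero,
    which is one simple way to "switch off" a neuron at inference time.
    """
    hidden = np.maximum(x @ w_in, 0.0)  # ReLU activations, shape (batch, hidden)
    if len(deactivate):
        hidden[:, list(deactivate)] = 0.0
    return hidden @ w_out


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))        # 4 token vectors of width 8
w_in = rng.normal(size=(8, 16))    # 16 hidden neurons
w_out = rng.normal(size=(16, 8))

baseline = feed_forward(x, w_in, w_out)
ablated = feed_forward(x, w_in, w_out, deactivate=[2, 5])  # hypothetical neuron ids

# Mean absolute change in the block output caused by the ablation.
shift = float(np.abs(baseline - ablated).mean())
print(shift)
```

In a real model the same masking would be applied inside selected layers, and the effect would be measured on task performance per language rather than on raw output differences.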

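Large language models perform remarkably well across many languages. This paper investigates how large language models handle multilingualism, proposes a framework describing how they process multilingual inputs, verifies that framework, and shows how it can be used to effectively improve multilingual ability.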

How do Large Language Models Handle Multilingualism?

Large language models (LLMs) demonstrate remarkable multilingual capabilities
without being pre-trained on specially curated multilingual parallel corpora.
It remains a challenging problem to explain the underlying mechanisms by which
LLMs process multilingual texts. In this paper, we delve into the composition
of Transformer architectures in LLMs to pinpoint language-specific regions.
Specifically, we propose a novel detection method, language activation probability
entropy (LAPE), to identify language-specific neurons within LLMs. Based on
LAPE, we conduct comprehensive experiments on two representative LLMs, namely
LLaMA-2 and BLOOM. Our findings indicate that LLMs' proficiency in processing a
particular language is predominantly due to a small subset of neurons,
primarily situated in the models' top and bottom layers. Furthermore, we
demonstrate the feasibility of "steering" the output language of LLMs by
selectively activating or deactivating language-specific neurons. Our research
provides important evidence for understanding and exploring the multilingual
capabilities of LLMs.
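
One plausible reading of the name "language activation probability entropy" can be sketched as follows. The abstract does not give the formula, so everything here is an assumption: each neuron is assigned an activation probability per language, those probabilities are normalized into a distribution over languages, and the Shannon entropy of that distribution is the score, with low entropy read as language-specific. The function name `lape_scores` and the toy probabilities are hypothetical and are not taken from the paper.

```python
import numpy as np


def lape_scores(act_prob, eps=1e-12):
    """Entropy of each neuron's normalized activation probabilities.

    act_prob has shape (n_neurons, n_languages); entry [i, k] is the assumed
    probability that neuron i is active on text in language k.
    A neuron that never fires also scores 0 and should be filtered separately.
    """
    p = np.asarray(act_prob, dtype=float)
    totals = p.sum(axis=1, keepdims=True)
    dist = p / np.maximum(totals, eps)  # distribution over languages per neuron
    # eps inside the log avoids log(0) for languages with zero probability.
    return -(dist * np.log(dist + eps)).sum(axis=1)


# Invented probabilities for 3 neurons and 2 languages.
probs = np.array([
    [0.9, 0.0],   # fires for one language only
    [0.5, 0.5],   # fires equally for both
    [0.1, 0.3],   # leans toward the second language
])
scores = lape_scores(probs)
most_specific = int(np.argmin(scores))
print(scores, most_specific)
```

Under this reading, neurons would be ranked by score and the lowest-entropy ones kept as candidates for language-specific neurons.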

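Using a new detection method, language activation probability entropy (LAPE), we examine the Transformer architecture in large language models to identify language-specific regions, and show that the output language of large language models can be controlled by activating or deactivating language-specific neurons.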