Gated Linear Units (GLUs) have become a common building block in modern foundation models. Bilinear layers drop the non-linearity in the "gate" but still have comparable performance to other GLUs. An attractive quality of bilinear layers is that they can be fully expressed in terms of a third-order tensor and linear operations. Leveraging this, we develop a method to decompose the bilinear tensor into a set of sparsely interacting eigenvectors that show promising interpretability properties in preliminary experiments for shallow image classifiers (MNIST) and small language models (Tiny Stories). Since the decomposition is fully equivalent to the model's original computations, bilinear layers may be an interpretability-friendly architecture that helps connect features to the model weights. Application of our method may not be limited to pretrained bilinear models since we find that language models such as TinyLlama-1.1B can be finetuned into bilinear variants.

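In summary: using a third-order tensor and linear operations, the method decomposes bilinear layers into a set of sparsely interacting eigenvectors, which show promising interpretability properties in preliminary experiments on shallow image classifiers (MNIST) and small language models (Tiny Stories). Because the decomposition is fully equivalent to the model's original computations, bilinear layers may be an interpretability-friendly architecture that helps connect features to the model weights. The method is not limited to pretrained bilinear models, since language models such as TinyLlama-1.1B can be finetuned into bilinear variants.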
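
To make the tensor view concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a bias-free bilinear layer of the form g(x) = (Wx) ⊙ (Vx) and assumes the eigendecomposition is taken per output direction `u`, a detail the abstract does not spell out. The weights, dimensions, and variable names (`W`, `V`, `B`, `u`, `Q_sym`) are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 6, 4

# Hypothetical random weights of one bilinear layer g(x) = (W x) * (V x), no bias.
W = rng.normal(size=(d_out, d_in))
V = rng.normal(size=(d_out, d_in))

# Third-order tensor B[k, i, j] = W[k, i] * V[k, j], so that g(x)[k] = x^T B[k] x.
B = np.einsum("ki,kj->kij", W, V)

x = rng.normal(size=d_in)
layer_out = (W @ x) * (V @ x)                   # usual forward pass
tensor_out = np.einsum("kij,i,j->k", B, x, x)   # same result via the tensor

# Assumption: pick an output direction u and contract it with the tensor
# to obtain an input-input interaction matrix for that direction.
u = np.zeros(d_out)
u[0] = 1.0
Q = np.einsum("k,kij->ij", u, B)

# Only the symmetric part of Q contributes to the quadratic form x^T Q x.
Q_sym = 0.5 * (Q + Q.T)

# Real eigenvalues and orthonormal eigenvectors of the symmetric matrix.
eigvals, eigvecs = np.linalg.eigh(Q_sym)

# The output along u is a weighted sum of squared eigenvector activations.
recon = np.sum(eigvals * (eigvecs.T @ x) ** 2)

print(np.allclose(layer_out, tensor_out))
print(np.isclose(recon, u @ layer_out))
```

Both checks compare the decomposed form against the ordinary forward pass, which illustrates the sense in which the decomposition is equivalent to the layer's original computation. With random weights the eigenvectors carry no meaning; any sparsity or interpretability would have to come from trained weights.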

Weight-based Decomposition: A Case for Bilinear MLPs