BriefGPT.xyz
May, 2024
From Neurons to Neutrons: A Case Study in Interpretability
Ouail Kitouni, Niklas Nolte, Víctor Samuel Pérez-Díaz, Sokratis Trifinopoulos, Mike Williams
TL;DR
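Viewed through the lens of mechanistic interpretability, high-dimensional neural networks can yield low-dimensional representations that offer insight relevant to human domain knowledge. As a case study, we extract nuclear physics concepts by studying models trained to reproduce nuclear data.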
Abstract
Mechanistic interpretability (MI) promises a path toward fully understanding how neural networks make their predictions. Prior work demonstrates that even when trained to perform simple arithmetic, models can imp…