BriefGPT.xyz
Jul, 2023
A Geometric Notion of Causal Probing
Clément Guerner, Anej Svete, Tianyu Liu, Alexander Warstadt, Ryan Cotterell
TL;DR
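Large language models make predictions from real-valued representations of text, which encode linguistic properties and biases (such as gender) learned from the training data. This study analyzes the information about such concepts by orthogonally projecting onto subspaces of the representation space, and proposes a method for concept-controlled generation. Empirical results show that, in at least one model, R-LACE returns a one-dimensional subspace containing roughly half of the total concept information, and that this subspace can be used to precisely manipulate the concept value of generated words.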
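To make the geometric idea concrete, here is a minimal NumPy sketch of orthogonal projection onto a one-dimensional subspace of a representation space. It is not R-LACE and not the authors' implementation: the direction `v` is a random stand-in for whatever a method such as R-LACE might return, and the function names `split_by_direction` and `set_concept_component` are hypothetical, introduced only for illustration.

```python
import numpy as np


def split_by_direction(h, v):
    """Split h into its component inside span{v} and the orthogonal remainder."""
    v_hat = v / np.linalg.norm(v)      # unit vector spanning the 1-D subspace
    P = np.outer(v_hat, v_hat)         # orthogonal projector onto span{v}
    h_par = P @ h                      # part of h inside the subspace
    h_perp = h - h_par                 # part of h in the orthogonal complement
    return h_par, h_perp


def set_concept_component(h, v, alpha):
    """Overwrite the coordinate of h along v with alpha, leaving the rest intact.

    Hypothetical illustration of "manipulating" a concept value; the
    actual intervention used in the paper may differ.
    """
    v_hat = v / np.linalg.norm(v)
    _, h_perp = split_by_direction(h, v)
    return h_perp + alpha * v_hat


# Toy data: an 8-dimensional "representation" and an assumed concept direction.
rng = np.random.default_rng(0)
h = rng.normal(size=8)
v = rng.normal(size=8)

h_par, h_perp = split_by_direction(h, v)
h_new = set_concept_component(h, v, alpha=2.0)
```

Under these assumptions, `h_par + h_perp` reconstructs `h`, `h_perp` has zero inner product with `v`, and `h_new` differs from `h` only in its coordinate along `v`.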
Abstract
Large language models rely on real-valued representations of text to make their predictions. These representations contain information learned from the data that the model has trained on, including knowledge of <