BriefGPT.xyz
Apr, 2025
Towards Quantifying Commonsense Reasoning with Mechanistic Insights
Abhinav Joshi, Areeb Ahmad, Divyaksh Shukla, Ashutosh Modi
TL;DR
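This study addresses shortcomings in how the commonsense reasoning abilities of large language models (LLMs) are currently evaluated. It creates a graph-structured annotation scheme that captures implicit knowledge and uses it for a rigorous evaluation across 37 everyday human activities. The study finds that the resulting resource can generate a large number of commonsense queries, and it reveals that the reasoning components in LLMs are localized, which matters for understanding how these models make decisions.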
Abstract
Commonsense reasoning deals with the implicit knowledge that is well understood by humans and typically acquired via interactions with the world. In recent times,