BriefGPT.xyz
Feb, 2024
Using Counterfactual Tasks to Evaluate the Generality of Analogical Reasoning in Large Language Models
Martha Lewis, Melanie Mitchell
TL;DR
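The study shows that although large language models perform well on analogical reasoning, they lack the robustness and generality of human analogical abilities.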
Abstract
Large language models (LLMs) have performed well on several reasoning benchmarks, including ones that test analogical reasoning abilities.