Dialectical Language Model Evaluation: An Initial Appraisal of the Commonsense Spatial Reasoning Abilities of LLMs

April 2023

Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs

Anthony G Cohn, Jose Hernandez-Orallo

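TL;DR: Using interactive, dialogue-based evaluation of language models, this work qualitatively examines the boundaries of their commonsense reasoning in the spatial domain, and offers suggestions for improving language model capabilities and for systematizing dialogue-based evaluation.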

Abstract

Language models have become very popular recently, and many claims have been made about their abilities, including commonsense reasoning. Given the increasingly better results of current →