The Representational Capacity of Neural Language Models and Chain-of-Thought Reasoning

June 2024

On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning

Franz Nowak, Anej Svete, Alexandra Butoi, Ryan Cotterell

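TL;DR: The performance of modern language models (LMs) has been improved by chain-of-thought (CoT) reasoning. CoT reasoning extends an LM's computational power, but this account also introduces a category error. To address this, we formalize CoT reasoning in a probabilistic setting and, by studying the representational capacity of sequence generation models, show that they can represent the same distributions over strings as probabilistic Turing machines.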

Abstract

The performance of modern language models (LMs) has been improved by chain-of-thought (CoT) reasoning, i.e., the process of generating intermediate results that guide the model towards a final answer. A possible explanation for this improvement is that CoT reasoning extends an LM's