BriefGPT.xyz
Feb, 2024
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma
TL;DR
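From the perspective of expressiveness, this paper gives a theoretical explanation of how chain of thought (CoT) strengthens decoder-only transformers. Guiding the model to generate intermediate steps (that is, a CoT) can significantly improve the accuracy of large language models on arithmetic and symbolic reasoning tasks.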
Abstract
Instructing the model to generate a sequence of intermediate steps, a.k.a. a chain of thought (CoT), is a highly effective method to improve the accuracy of large language models (LLMs) on arithmetic and …