This work concerns the path-star task, a minimal example of searching over a graph. The graph, $G$, is star-shaped, with $D$ arms radiating from a start node, $s$. A language model (LM) is given $G$, $s$, and a target node, $t$, which ends one of the arms, and is tasked with generating the arm containing $t$. The minimal nature of this task means that only a single choice needs to be made: which of the $D$ arms contains $t$? Decoder-only LMs fail to perform above the $1/D$ chance rate on this elementary task because of a learned shortcut that absorbs training supervision. We show how this pathology is caused by excess supervision, and we present a series of solutions demonstrating that the task is solvable by decoder-only LMs. We find that the task's minimal nature is the cause of its difficulty, as it prevents task decomposition. Our solutions provide insight into the pathology and its implications for LMs trained via next-token prediction.
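To make the task setup concrete, the following is a minimal sketch of how one path-star instance might be built, assuming integer node labels and a fixed arm length. The function names, the `arm_len` and `num_nodes` parameters, and the edge-list representation are illustrative assumptions and are not taken from the paper. The sketch also includes a random-arm guesser, which illustrates the $1/D$ chance rate, and a backtracking solver, which shows that the instance is trivial for a classical search because every non-start node in a star has exactly one predecessor.

```python
import random

def make_path_star(num_arms, arm_len, num_nodes, rng):
    """Build one instance: shuffled directed edges, start s, target t, target arm.

    All parameter names are hypothetical; the paper's data format may differ.
    """
    needed = 1 + num_arms * arm_len
    if needed > num_nodes:
        raise ValueError("not enough node labels for this graph size")
    labels = rng.sample(range(num_nodes), needed)  # distinct random labels
    s = labels[0]
    arms = [labels[1 + i * arm_len: 1 + (i + 1) * arm_len]
            for i in range(num_arms)]
    edges = []
    for arm in arms:
        path = [s] + arm
        edges.extend(zip(path, path[1:]))  # directed edges pointing away from s
    rng.shuffle(edges)  # edge order carries no information about the arms
    target_arm = rng.choice(arms)
    t = target_arm[-1]  # the target is the leaf that ends the chosen arm
    return edges, s, t, [s] + target_arm

def guess_arm(edges, s, rng):
    """Chance baseline: pick a random arm at s, then follow it to its leaf."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    node = rng.choice(succ[s])  # the single real decision in the task
    path = [s, node]
    while node in succ:  # every non-start node has at most one successor
        node = succ[node][0]
        path.append(node)
    return path

def solve_by_backtracking(edges, s, t):
    """Exact solution: walk from t back to s through unique predecessors."""
    pred = {v: u for u, v in edges}
    path = [t]
    while path[-1] != s:
        path.append(pred[path[-1]])
    return path[::-1]
```

For example, `make_path_star(5, 4, 100, random.Random(0))` yields a graph with 5 arms of 4 nodes each. Over many instances, `guess_arm` recovers the target arm about one time in five, while `solve_by_backtracking` always does.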

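This study concerns the path-star task, a simple example of searching over a graph. It finds that decoder-only language models (LMs) perform poorly on this task because excess supervision leads to a learned shortcut. By presenting a series of solutions, it shows that decoder-only models can solve the task effectively, which in turn offers new insight into the training of such language models.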

Language Models, Graph Search, and Supervision Adulteration: When More Supervision Is Less, and How to Make More Supervision More Effective