One way to address safety risks from large language models (LLMs) is to
censor dangerous knowledge from their training data. While this removes the
explicit information, implicit information can remain scattered across various
training documents. Could an LLM infer the censored knowledge by piecing
together these implicit hints? As a step towards answering this question, we
study inductive out-of-context reasoning (OOCR), a type of generalization in
which LLMs infer latent information from evidence distributed across training
documents and apply it to downstream tasks without in-context learning. Using a
suite of five tasks, we demonstrate that frontier LLMs can perform inductive
OOCR. In one experiment we finetune an LLM on a corpus consisting only of
distances between an unknown city and other known cities. Remarkably, without
in-context examples or Chain of Thought, the LLM can verbalize that the unknown
city is Paris and use this fact to answer downstream questions. Further
experiments show that LLMs trained only on individual coin flip outcomes can
verbalize whether the coin is biased, and those trained only on pairs
$(x,f(x))$ can articulate a definition of $f$ and compute inverses. While OOCR
succeeds in a range of cases, we also show that it is unreliable, particularly
for smaller LLMs learning complex structures. Overall, the ability of LLMs to
"connect the dots" without explicit in-context learning poses a potential
obstacle to monitoring and controlling the knowledge acquired by LLMs.
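
To make the city experiment concrete, the sketch below builds a toy finetuning corpus in which the latent city is never named and only its distances to known cities appear. The placeholder name `City 50337`, the prompt wording, the approximate coordinates, and the helper names `haversine_km` and `build_corpus` are illustrative assumptions, not the data format used in the paper.

```python
import math

# Approximate (latitude, longitude) in degrees; illustrative values only.
KNOWN_CITIES = {
    "London": (51.5074, -0.1278),
    "Berlin": (52.5200, 13.4050),
    "Madrid": (40.4168, -3.7038),
    "Rome": (41.9028, 12.4964),
}
LATENT_CITY = (48.8566, 2.3522)  # Paris; its name never appears in the corpus


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def build_corpus(placeholder="City 50337"):
    """Return one prompt/completion record per known city (hypothetical format)."""
    return [
        {
            "prompt": f"What is the distance between {placeholder} and {name}?",
            "completion": f"{haversine_km(LATENT_CITY, coords):.0f} km",
        }
        for name, coords in KNOWN_CITIES.items()
    ]


corpus = build_corpus()
```

Each record is an individual piece of evidence; identifying the city requires aggregating many of them, which is the kind of inference the abstract calls inductive OOCR.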

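Safety risks from large language models can be addressed by removing dangerous knowledge from the training data, but implicit information may remain scattered across training documents. We study a type of generalization called inductive out-of-context reasoning, in which a model infers latent information from evidence distributed across training documents and applies it to downstream tasks, and we show that large language models can perform inductive out-of-context reasoning.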

Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data

The semantics of a text is manifested not only by what is read, but also by
what is not read. In this article, we study how implicit "not read"
information, such as end-of-paragraph (\eop) and end-of-sequence (\eos),
affects the quality of text generation. Specifically, we find that the
pre-trained language model GPT2 can generate better continuations by learning
to generate \eop in the fine-tuning stage. Experimental results on English
story generation show that \eop can lead to a higher BLEU score and lower \eos
perplexity. We also conduct experiments on a self-collected Chinese essay
dataset with Chinese-GPT2, a character-level LM without \eop or \eos during
pre-training. Experimental results show that Chinese-GPT2 can generate
better essay endings with \eop.
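
As a rough illustration of what learning to generate \eop in fine-tuning could involve, the sketch below marks paragraph and sequence boundaries in raw text before it is tokenized. The marker strings `<eop>` and `<eos>`, the blank-line paragraph convention, and the helper `mark_boundaries` are assumptions made for this example; the abstract does not specify the preprocessing.

```python
EOP = "<eop>"  # hypothetical end-of-paragraph marker
EOS = "<eos>"  # hypothetical end-of-sequence marker


def mark_boundaries(text):
    """Append EOP to every non-empty paragraph and EOS to the whole text.

    Paragraphs are assumed to be separated by blank lines.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "\n".join(f"{p} {EOP}" for p in paragraphs) + f" {EOS}"


story = "Once upon a time there was a fox.\n\nThe fox went home."
marked = mark_boundaries(story)
```

With a real tokenizer, the two markers would typically be registered as special tokens so that each maps to a single id.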

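We study how implicit information affects text generation quality and find that the pre-trained language model GPT2 produces better continuations when it learns to generate the end-of-paragraph marker during fine-tuning, with favorable experimental results on both English story generation and Chinese essay generation.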