Questions combine our mastery of language with our remarkable facility for reasoning about uncertainty. How do people navigate vast hypothesis spaces to pose informative questions given limited cognitive resources? We study these tradeoffs in a classic grounded question-asking task based on the board game Battleship. Our language-informed program sampling (LIPS) model uses large language models (LLMs) to generate natural language questions, translate them into symbolic programs, and evaluate their expected information gain. We find that with a surprisingly modest resource budget, this simple Monte Carlo optimization strategy yields informative questions that mirror human performance across varied Battleship board scenarios. In contrast, LLM-only baselines struggle to ground questions in the board state; notably, GPT-4V provides no improvement over non-visual baselines. Our results illustrate how Bayesian models of question-asking can leverage the statistics of language to capture human priors, while highlighting some shortcomings of pure LLMs as grounded reasoners.

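The model uses large language models to generate natural language questions, translate them into symbolic programs, and evaluate their expected information gain, so that informative questions can be posed under limited cognitive resources. The results show that this simple Monte Carlo optimization strategy yields informative questions resembling human performance across varied Battleship scenarios, whereas pure language models have some difficulty grounding questions in the game state.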
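The scoring step described above, in which a question expressed as a program is rated by its expected information gain, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes noiseless answers and equally weighted candidate boards, in which case the expected information gain reduces to the Shannon entropy of the answer distribution. The 2x2 board, the `make_board` helper, and the hand-written lambdas (standing in for LLM-translated programs) are all hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Dict, Hashable, Sequence, Tuple

# Hypothetical board representation: a grid of "S" (ship) and "W" (water) tiles.
Board = Tuple[Tuple[str, ...], ...]
Question = Callable[[Board], Hashable]


def expected_information_gain(question: Question, boards: Sequence[Board]) -> float:
    """Entropy (in bits) of the answers a question yields over candidate boards.

    Assumes answers are deterministic and every board is equally likely,
    so this entropy equals the expected information gain.
    """
    counts = Counter(question(board) for board in boards)
    total = len(boards)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_question(candidates: Dict[str, Question], boards: Sequence[Board]) -> str:
    """Return the text of the candidate question with the highest score."""
    return max(
        candidates,
        key=lambda text: expected_information_gain(candidates[text], boards),
    )


def make_board(row: int, col: int) -> Board:
    """Build a 2x2 toy board with a single ship tile at (row, col)."""
    return tuple(
        tuple("S" if (i, j) == (row, col) else "W" for j in range(2))
        for i in range(2)
    )


# Toy hypothesis space: the single ship tile could be in any of four cells.
boards = [make_board(r, c) for r in range(2) for c in range(2)]

# Hand-written stand-ins for sampled questions and their program translations.
questions: Dict[str, Question] = {
    "Is there a ship tile in the top row?": lambda b: "S" in b[0],
    "Is the top-left tile a ship?": lambda b: b[0][0] == "S",
    "Is there any ship on the board?": lambda b: any("S" in row for row in b),
}

for text, program in questions.items():
    print(f"{expected_information_gain(program, boards):.3f} bits  {text}")
print("Best:", best_question(questions, boards))
```

In this toy setting the question that splits the four boards evenly scores highest, while the question whose answer is the same on every board scores zero. Sampling a handful of candidates and keeping the top scorer is the Monte Carlo selection idea in miniature.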

Battleship Opens Fire: Asking Questions with Natural-Language-Guided Program Sampling