Query rewriting is a crucial technique for passage retrieval in open-domain conversational question answering (CQA). It decontextualizes conversational queries into self-contained questions suitable for off-the-shelf retrievers. Existing methods attempt to incorporate the retriever's preference during the training of rewriting models. However, these approaches typically rely on extensive annotations such as in-domain rewrites and/or relevant passage labels, limiting the models' generalization and adaptation capabilities. In this paper, we introduce AdaQR ($\textbf{Ada}$ptive $\textbf{Q}$uery $\textbf{R}$ewriting), a framework for training query rewriting models with limited rewrite annotations from seed datasets and no passage labels at all. Our approach begins by fine-tuning compact large language models using only ~$10\%$ of the rewrite annotations from the seed dataset's training split. The models are then used to generate rewrite candidates for each query instance. We then propose a novel way to assess the retriever's preference for these candidates: the probability of the answer conditioned on the conversational query, obtained by marginalizing over the Top-$K$ passages retrieved with each candidate. This probability serves as the reward for further optimizing the rewriter with Direct Preference Optimization (DPO), a process that requires neither rewrite nor retrieval annotations. Experimental results on four open-domain CQA datasets demonstrate that AdaQR not only enhances the in-domain capabilities of the rewriter with limited annotation requirements, but also adapts effectively to out-of-domain datasets.
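The reward described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes that the passage weights are a softmax over the retriever's Top-$K$ scores and that a language model supplies the log-likelihood of the answer given the conversation and each passage. The function names (`marginal_answer_probability`, `build_preference_pairs`) and the `min_gap` parameter are hypothetical and not taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw retrieval scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def marginal_answer_probability(retrieval_scores, answer_log_probs):
    """Approximate P(answer | conversation) by marginalizing over Top-K passages.

    retrieval_scores[k]: retriever score of passage k for one rewrite candidate.
    answer_log_probs[k]: log P(answer | conversation, passage k), assumed to
                         come from a language model (not computed here).
    """
    if len(retrieval_scores) != len(answer_log_probs):
        raise ValueError("need one answer log-probability per passage")
    passage_probs = softmax(retrieval_scores)
    return sum(p * math.exp(lp) for p, lp in zip(passage_probs, answer_log_probs))


def build_preference_pairs(candidates, rewards, min_gap=0.0):
    """Return (preferred, rejected) rewrite pairs whose reward gap exceeds min_gap."""
    pairs = []
    for i, better in enumerate(candidates):
        for j, worse in enumerate(candidates):
            if rewards[i] - rewards[j] > min_gap:
                pairs.append((better, worse))
    return pairs
```

Under these assumptions, each rewrite candidate is scored once with `marginal_answer_probability`, and `build_preference_pairs` turns the scores into the preference pairs that DPO consumes. No gold rewrite or passage label is needed, only the answer from the conversation.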

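AdaQR is a framework for training query rewriting models with limited rewrite annotations and no passage labels at all. It fine-tunes compact large language models on only ~10% of the rewrite annotations from the seed dataset, uses these models to generate rewrite candidates for each query instance, and assesses the retriever's preference for each candidate through a conditional probability. That probability serves as the reward for further optimizing the rewriter with Direct Preference Optimization (DPO). Experimental results show that AdaQR not only improves the rewriter's in-domain performance under limited annotation, but also adapts effectively to out-of-domain datasets.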
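For reference, the standard DPO objective for a single preference pair can be sketched as follows. This is the generic DPO loss rather than AdaQR's training code. The inputs are sequence log-probabilities of the preferred and rejected rewrites under the policy being trained and under a frozen reference model, and the default `beta=0.1` is an assumed value, not one reported in the text.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair: -log(sigmoid(margin))."""
    # Implicit reward of each rewrite: how much the policy raises its
    # log-probability relative to the frozen reference model.
    chosen_shift = policy_logp_chosen - ref_logp_chosen
    rejected_shift = policy_logp_rejected - ref_logp_rejected
    margin = beta * (chosen_shift - rejected_shift)
    # -log(sigmoid(x)) == log(1 + exp(-x)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss equals `log(2)` when the policy and the reference agree, and it shrinks as the policy assigns relatively more probability to the preferred rewrite than the reference does.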

Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers