Prompt-Based Monte-Carlo Tree Search for Goal-Oriented Dialogue Policy Planning

May, 2023

Prompt-Based Monte-Carlo Tree Search for Goal-Oriented Dialogue Policy Planning

Xiao Yu, Maximillian Chen, Zhou Yu

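TL;DR: Proposes GDP-Zero, an approach that uses Open-Loop MCTS for goal-oriented dialogue policy planning without any model training. Its responses are preferred over ChatGPT's up to 59.32% of the time, and they are rated as more persuasive than ChatGPT's in interactive evaluations.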

Abstract

Planning for goal-oriented dialogue often requires simulating future dialogue interactions and estimating task progress. Many approaches thus consider training neural networks to perform look-ahead search algorit