KRLS: Improving End-to-End Response Generation in Task-Oriented Dialog with Reinforced Keywords Learning

Nov, 2022

Reinforced Language Modeling for End-to-End Task Oriented Dialog

Xiao Yu, Qingyang Wu, Kun Qian, Zhou Yu

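TL;DR: This paper proposes a new training algorithm, Keywords Reinforcement Learning with Next-word Sampling (KRLS), which uses reinforcement learning to train keyword generation in task-oriented dialog models while avoiding time-consuming auto-regressive generation. Experiments show that KRLS achieves state-of-the-art performance on the MultiWoZ benchmark in inform rate, success rate, and combined score.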
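
The summary above does not spell out how sampling or rewards are computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the model's logits at each position are computed with the gold prefix as context (teacher forcing), so one token can be sampled per position from a single forward pass instead of decoding auto-regressively. The function name, the `keyword_mask` input, and the reward values `keyword_reward` and `other_reward` are hypothetical choices made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_word_sampling_loss(step_logits, gold_ids, keyword_mask, rng,
                            keyword_reward=1.0, other_reward=0.1):
    """REINFORCE-style loss with one sampled token per position.

    step_logits[t] are assumed to be the logits at position t given the
    gold prefix gold_ids[:t], so no auto-regressive decoding is needed.
    The reward scheme is a hypothetical placeholder: a sampled token is
    rewarded if it matches the gold token and penalized otherwise, with
    a larger magnitude at keyword positions.
    """
    loss = 0.0
    sampled = []
    for logits, gold, is_keyword in zip(step_logits, gold_ids, keyword_mask):
        probs = softmax(logits)
        # Sample the next word from the model's own distribution.
        tok = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        sampled.append(tok)
        scale = keyword_reward if is_keyword else other_reward
        reward = scale if tok == gold else -scale
        # Policy-gradient surrogate: -reward * log p(sampled token).
        loss += -reward * math.log(probs[tok])
    return loss / len(gold_ids), sampled


# Toy usage: vocabulary of 3 tokens, 2 positions, first position is a keyword.
toy_logits = [[2.0, 0.5, 0.1], [0.3, 1.5, 0.2]]
toy_loss, toy_sampled = next_word_sampling_loss(
    toy_logits, gold_ids=[0, 1], keyword_mask=[True, False],
    rng=random.Random(0))
```

In a real training setup the logits would come from the dialog model and the loss would be differentiated by an autograd framework; the sketch only shows how per-position sampling and keyword-weighted rewards could fit together.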

Abstract

In task-oriented dialogs such as MultiWoZ (Budzianowski et al., 2018), an informative and/or successful system response needs to include necessary key information such as the phone number of a hotel. Therefore, we hypothesize that by helping the model to focus more on learning key quan