Learning from Dialogue after Deployment: Feed Yourself, Chatbot!

Jan, 2019

Learning from Dialogue after Deployment: Feed Yourself, Chatbot!

Braden Hancock, Antoine Bordes, Pierre-Emmanuel Mazare, Jason Weston

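TL;DR: This work proposes a self-feeding chatbot that improves its dialogue ability by extracting new training examples from the conversations it takes part in and by estimating user satisfaction. Experiments on the PersonaChat chit-chat dataset show significant performance gains.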

Abstract

The majority of conversations a dialogue agent sees over its lifetime occur after it has already been trained and deployed, leaving a vast store of potential training signal untapped. In this work, we propose the