Generating informative responses in end-to-end neural dialogue systems has attracted considerable attention in recent years. Previous work has leveraged external knowledge and dialogue context to generate such responses. Nevertheless, few studies have demonstrated their capability in incorporat