Existing speech recognition systems are typically built at the sentence
level, although it is known that dialog context, e.g., higher-level knowledge
that spans sentences or speakers, can help the processing of long
conversations. The recent progress in end-to-end speech recognit