Jun, 2023
Streaming Speech-to-Confusion Network Speech Recognition
Denis Filimonov, Prabhat Pandey, Ariya Rastrow, Ankur Gandhe, Andreas Stolcke
TL;DR
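This paper proposes a novel streaming automatic speech recognition (ASR) architecture that outputs a confusion network while keeping latency bounded, as interactive applications require. Its 1-best results are on par with a comparable RNN-T system, while the richer hypothesis set allows second-pass rescoring to achieve 10-20% lower word error rate on the LibriSpeech task. The model also outperforms a strong RNN-T baseline on a far-field voice assistant task.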
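As a rough illustration of why a confusion network gives a second pass more to work with than a single 1-best string, the sketch below represents a confusion network as a list of slots, extracts the 1-best path, enumerates an n-best list, and rescores it. This is not the paper's model or decoder: the words, the probabilities, the `<eps>` skip symbol, and the `toy_second_pass_score` function are all assumptions made up for the example, and a real system would use a trained rescoring model.

```python
import heapq
import math
from itertools import product

EPS = "<eps>"  # assumed symbol for a "no word" (skip) arc

# Toy confusion network: one dict per slot, mapping candidate word -> posterior.
# All words and probabilities are invented for illustration.
confusion_network = [
    {"the": 0.7, "a": 0.3},
    {"cat": 0.55, "cap": 0.45},
    {"sat": 0.8, EPS: 0.2},
]


def one_best(cn):
    """Pick the most probable word in each slot and drop skip arcs."""
    best = [max(slot, key=slot.get) for slot in cn]
    return [w for w in best if w != EPS]


def n_best(cn, n):
    """Enumerate all paths (fine for a toy network) and keep the n most probable.

    Returns a list of (first_pass_log_prob, words), best first.
    """
    paths = []
    for choice in product(*(slot.items() for slot in cn)):
        log_prob = sum(math.log(p) for _, p in choice)
        words = [w for w, _ in choice if w != EPS]
        paths.append((log_prob, words))
    return heapq.nlargest(n, paths, key=lambda item: item[0])


def toy_second_pass_score(words):
    """Hypothetical second-pass scorer (stand-in for a stronger model)."""
    return -0.1 if "cap" in words else -2.0


def rescore(hypotheses, second_pass, weight=1.0):
    """Return the hypothesis with the best combined first- and second-pass score."""
    best = max(hypotheses, key=lambda h: h[0] + weight * second_pass(h[1]))
    return best[1]


if __name__ == "__main__":
    print("1-best:", one_best(confusion_network))
    hyps = n_best(confusion_network, 4)
    print("rescored:", rescore(hyps, toy_second_pass_score))
```

In this toy setup the second pass can select a path other than the first-pass 1-best, but only because that alternative was kept in the network to begin with.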
Abstract
In interactive automatic speech recognition (ASR) systems, low-latency requirements limit the amount of search space that can be explored during decoding, particularly in end-to-end neural ASR. In this paper, we
→