BriefGPT.xyz
Mar, 2022
Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR
Xuankai Chang, Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux
TL;DR
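The graph-based temporal classification (GTC) loss, in its general form, is used to improve the performance of automatic speech recognition systems. This work proposes an extended GTC (GTC-e) in which a neural network models both the labels and the label transitions, so that the loss can be applied to a wider range of tasks. The authors use GTC-e for multi-speaker speech recognition and obtain promising results that come close to the classical benchmarks for the task.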
Abstract
Graph-based temporal classification (GTC), a generalized form of the connectionist temporal classification loss, was recently proposed to improve automatic speech recognition (ASR) systems using graph-based super
→