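TL;DR: LASER, an SSFT method that learns and aligns self-supervised representations, achieves significant improvements on ASR and PR tasks with only a small amount of GPU fine-tuning.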
Abstract
Self-supervised learning (SSL)-based speech models are extensively used for
full-stack speech processing. However, it has been observed that improving
SSL-based speech representations using unlabeled speech for content-related
tasks is challenging and computationally expensive. Recent