Improving readability for automatic speech recognition transcription

Authors: Junwei Liao, Sefik Eskimez, Liyang Lu, Yu Shi, Ming Gong

Abstract: Modern automatic speech recognition (ASR) systems can achieve high recognition accuracy. However, even a perfectly accurate transcript can still be challenging to read because of grammatical errors, disfluencies, and other noise common in spoken communication. These readability issues, introduced by both speakers and ASR systems, impair the performance of downstream tasks and the understanding of human readers. In this work, we present a task called ASR post-processing for readability (APR) and formulate it as a sequence-to-sequence text generation problem. The APR task aims to transform noisy ASR output into text that is readable for humans and usable by downstream tasks while preserving the speaker's semantic meaning. We further study the APR task in terms of the benchmark dataset, evaluation metrics, and baseline models: First, to address the lack of task-specific data, we propose a method to …
