Quality estimation of hybrid transcription of audio

Authors: Ben Tsvi Yaakov Kobi, Getz Iris, Livne Tom, Shellef Eric Ariel, Rosensweig Elisha Yehuda

Abstract: Hybrid transcription of audio relies on having one or more layers of human transcribers who review transcriptions generated by automatic speech recognition (ASR) systems in order to correct errors that are found in the transcriptions. When it comes to determining how much human reviewing is needed, such as how many layers of review to use, there is a cost/benefit tradeoff to consider. Some embodiments described herein utilize a machine learning-based approach for estimating the quality of hybrid transcription of audio. In one embodiment, a computer generates a transcription of a segment of audio using an ASR system, which is subsequently reviewed by a transcriber. The computer then calculates, based on properties of the transcription and properties of the transcriber, a value indicative of the expected accuracy of the transcription. The computer may suggest review by a second transcriber if the value is below a threshold.
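
The decision flow in the abstract (estimate expected accuracy from properties of the transcription and the transcriber, then suggest a second review when the estimate falls below a threshold) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the source: the three feature names, the logistic form of the model, the hand-set weights, and the 0.9 threshold. An actual embodiment would presumably learn its model from previously reviewed transcriptions.

```python
import math
from dataclasses import dataclass


@dataclass
class ReviewedTranscription:
    """Hypothetical properties of one ASR transcription after a first human review."""
    asr_confidence: float        # mean ASR word confidence in [0, 1] (assumed feature)
    edit_rate: float             # fraction of words the transcriber changed (assumed feature)
    transcriber_accuracy: float  # transcriber's historical accuracy in [0, 1] (assumed feature)


# Illustrative, hand-set parameters; a real system would fit these to training data.
BIAS = -3.0
W_ASR_CONFIDENCE = 2.0
W_EDIT_RATE = -1.0
W_TRANSCRIBER_ACCURACY = 4.0


def expected_accuracy(t: ReviewedTranscription) -> float:
    """Return a value in (0, 1) indicative of the expected accuracy of the transcription."""
    z = (
        BIAS
        + W_ASR_CONFIDENCE * t.asr_confidence
        + W_EDIT_RATE * t.edit_rate
        + W_TRANSCRIBER_ACCURACY * t.transcriber_accuracy
    )
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing of the linear score


def needs_second_review(t: ReviewedTranscription, threshold: float = 0.9) -> bool:
    """Suggest a second transcriber when the estimate falls below the threshold."""
    return expected_accuracy(t) < threshold


if __name__ == "__main__":
    strong = ReviewedTranscription(asr_confidence=0.95, edit_rate=0.05, transcriber_accuracy=0.98)
    weak = ReviewedTranscription(asr_confidence=0.60, edit_rate=0.40, transcriber_accuracy=0.70)
    for name, item in (("strong", strong), ("weak", weak)):
        print(name, round(expected_accuracy(item), 3), needs_second_review(item))
```

Raising the threshold sends more segments to a second transcriber, which is the cost/benefit tradeoff the abstract refers to: higher expected accuracy at the price of more human review.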
