Correcting recognition errors via discriminative utterance verification

Authors: A.R. Setlur, R.A. Sukkar, J. Jacob

DOI: 10.1109/ICSLP.1996.607433

Keywords: Process (computing); Word error rate; Error detection and correction; Pattern recognition; Task (project management); Discriminative model; Artificial intelligence; Voice activity detection; Utterance; Speech recognition; Binary decision diagram; Computer science

Abstract: Utterance verification (UV) is a process by which the output of a speech recognizer is verified to determine if the input speech actually includes the recognized keyword(s). The verifier makes a binary decision to accept or reject the utterance based on a UV confidence score. In this paper, we extend the notion of utterance verification to not only detect errors but also selectively correct them. We perform error correction by flipping the hypotheses produced by an N-best recognizer in cases when the top candidate has a UV confidence score that is lower than that of the next candidate. We propose two measures for computing confidence scores and investigate the use of a hybrid confidence measure that combines the two measures into a single confidence score. Using this hybrid confidence measure and an N-best algorithm, we obtained an 11% improvement in word-error rate on a connected digit recognition task. This improvement was achieved while still maintaining reliable detection of non-keyword speech and misrecognitions.
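
The flip-then-verify idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two confidence measures are defined or combined, so the `Hypothesis` fields, the weighted-sum `hybrid_score`, the `weight` parameter, and the `threshold` value are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    digits: str     # recognized digit string
    score_a: float  # first confidence measure (hypothetical stand-in)
    score_b: float  # second confidence measure (hypothetical stand-in)


def hybrid_score(h: Hypothesis, weight: float = 0.5) -> float:
    # Assumption: the hybrid measure is a weighted sum of the two measures.
    # The abstract only says the two are combined into a single score.
    return weight * h.score_a + (1.0 - weight) * h.score_b


def verify_and_correct(
    nbest: List[Hypothesis], threshold: float, weight: float = 0.5
) -> Optional[Hypothesis]:
    """Return the accepted hypothesis, or None if the utterance is rejected.

    nbest is ordered by recognizer likelihood, best first.
    """
    if not nbest:
        return None
    best = nbest[0]
    # Error correction: flip to the next candidate when it has the
    # higher confidence score.
    if len(nbest) > 1 and hybrid_score(nbest[1], weight) > hybrid_score(best, weight):
        best = nbest[1]
    # Error detection: binary accept/reject on the confidence score.
    return best if hybrid_score(best, weight) >= threshold else None


# Illustrative scores only; these are made-up numbers, not data from the paper.
nbest = [Hypothesis("5551234", 0.2, 0.4), Hypothesis("5559234", 0.7, 0.9)]
result = verify_and_correct(nbest, threshold=0.5)
print(result.digits if result else "rejected")
```

In this sketch the second candidate replaces the first because its combined score is higher, and it is then accepted because that score clears the threshold. A single low-scoring hypothesis would instead be rejected.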
