Dealing with unexpected words in automatic recognition of speech

Author: Hynek Hermansky

DOI: 10.1007/978-3-642-23538-2_1

Abstract: Unexpected words attract the listener's attention. They are information-rich, and getting them right is important for human communication. In the automatic recognition of speech (ASR), words that are not in the expected lexicon of the machine are typically substituted by some acoustically similar but nevertheless wrong words. The article discusses reasons for this undesirable behavior of the machine, describes known examples of dealing with the unexpected in perception and their implications, and proposes an alternative architecture for ASR that could alleviate problems with unexpected acoustic inputs. Some published experimental results from using this architecture are given.

References (24)
Martin Karafiát, Hynek Hermansky, Stefan Kombrink, Lukáš Burget, Pavel Matejka. Posterior-based Out of Vocabulary Word Detection in Telephone Speech. Conference of the International Speech Communication Association, pp. 80-83 (2009).
Hynek Hermansky, Mirko Hannemann, Hamed Ketabdar. Detection of out-of-vocabulary words in posterior based ASR. Conference of the International Speech Communication Association, pp. 1757-1760 (2007).
Martin Karafiát, Stefan Kombrink, Lukáš Burget, Mirko Hannemann. Similarity scoring for recognizing repeated out-of-vocabulary words. Conference of the International Speech Communication Association, pp. 897-900 (2010).
Lin Lawrance Chase. Error-responsive feedback mechanisms for speech recognizers. Carnegie Mellon University (1997).
Stefan Kombrink, Mirko Hannemann, Lukáš Burget, Hynek Heřmanský. Recovery of rare words in lecture speech. Text, Speech and Dialogue, pp. 330-337 (2010). DOI: 10.1007/978-3-642-15760-8_42
Arthur Boothroyd, Susan Nittrouer. Mathematical treatment of context effects in phoneme and word recognition. Journal of the Acoustical Society of America, vol. 84, pp. 101-114 (1988). DOI: 10.1121/1.396976
Cyma Van Petten, Seana Coulson, Susan Rubin, Elena Plante, Marjorie Parks. Time course of word identification and semantic integration in spoken language. Journal of Experimental Psychology: Learning, Memory and Cognition, vol. 25, pp. 394-417 (1999). DOI: 10.1037/0278-7393.25.2.394
George A. Miller, George A. Heise, William Lichten. The intelligibility of speech as a function of the context of the test materials. Journal of Experimental Psychology, vol. 41, pp. 329-335 (1951). DOI: 10.1037/H0062491
Stephen V. David, Jonathan B. Fritz, Shihab A. Shamma, Nima Mesgarani. Erratum: "Phoneme representation and classification in primary auditory cortex" [J. Acoust. Soc. Am. 123 (2), 899–909 (2008)]. The Journal of the Acoustical Society of America, vol. 123, p. 2433 (2008). DOI: 10.1121/1.2907536