Breadth-first search for finding the optimal phonetic transcription from multiple utterances.

Authors: Maximilian Bisani, Hermann Ney

DOI:

Keywords:

Abstract: Extending the vocabulary of a large speech recognition system usually requires phonetic transcriptions for all words to be known. With automatic baseform determination, acoustic samples of the words in question can substitute for the required expert knowledge. In this paper we follow a probabilistic approach to this problem and present a novel breadth-first search algorithm which takes full advantage of multiple samples. An extension to generate phone graphs, as well as an EM-based iteration scheme for estimating stochastic pronunciation models, is presented. In preliminary experiments, phoneme error rates below 5% with respect to the standard pronunciation are achieved without language- or word-specific prior knowledge.
