Construction and Analysis of Multiple Paths in Syllable Models

Authors: L.F.M. ten Bosch, L.W.J. Boves, K.A. Hämäläinen

Keywords: Computer science, Variation (linguistics), Syllable, Feature vector, Construct (philosophy), Pronunciation, Phone, Speech recognition

Abstract: In this paper, we construct multi-path syllable models using phonetic knowledge for initialising the parallel paths, and a data-driven solution for their re-estimation. We hypothesise that the richer topology of multi-path syllable models would be better at accounting for pronunciation variation than context-dependent phone models that can only account for the effects of left and right neighbours. We show that parallel paths that are initialised with phonetic knowledge and then re-estimated do indeed result in different trajectories in feature space. Yet, this does not result in better recognition performance. We suggest explanations for this finding, and provide the reader with important insights into the issues playing a role in pronunciation variation modelling with syllable models.

References (11)
M. Ostendorf. Moving beyond the 'beads-on-a-string' model of speech. Proc. of IEEE ASRU Workshop, Keystone, Co. (1999).
R. Harald Baayen, Wim Goedertier, Nelleke Oostdijk, Frank Van Eynde, Michael Moortgat, Louis Boves, Jean-Pierre Martens. Experiences from the Spoken Dutch Corpus Project. Language Resources and Evaluation, vol. 1, pp. 340-347 (2002).
S. Kullback, R. A. Leibler. On Information and Sufficiency. Annals of Mathematical Statistics, vol. 22, pp. 79-86 (1951). DOI: 10.1214/AOMS/1177729694
Li Deng, Dong Yu, A. Acero. Structured speech modeling. IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, pp. 1492-1504 (2006). DOI: 10.1109/TASL.2006.878265
Annika Hamalainen, Louis ten Bosch, Lou Boves. Modelling Pronunciation Variation using Multi-Path HMMs for Syllables. International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 781-784 (2007). DOI: 10.1109/ICASSP.2007.367029
Judith M. Kessens, Catia Cucchiarini, Helmer Strik. A data-driven method for modeling pronunciation variation. Speech Communication, vol. 40, pp. 517-534 (2003). DOI: 10.1016/S0167-6393(02)00150-4
J.M. de Veth, L.W.J. Boves, K.A. Hämäläinen. Syllable-Length Acoustic Units in Large-Vocabulary Continuous Speech Recognition. International Conference on Speech and Computer, pp. 499-502 (2005).
A. Sethy, S. Narayanan. Split-lexicon based hierarchical recognition of speech using syllable and word level acoustic units. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 772-775 (2003). DOI: 10.1109/ICASSP.2003.1198895
A. Ganapathiraju, J. Hamaker, J. Picone, M. Ordowski, G.R. Doddington. Syllable-based large vocabulary continuous speech recognition. IEEE Transactions on Speech and Audio Processing, vol. 9, pp. 358-366 (2001). DOI: 10.1109/89.917681