Exploiting Prosodic Breaks in Language Modeling with Random Forests

Authors: Frederick Jelinek, Yi Su

Abstract: We propose a novel method of exploiting prosodic breaks in language modeling for automatic speech recognition (ASR) based on the random forest language model (RFLM), which is a collection of randomized decision tree language models and can potentially ask any questions about the history in order to predict the future. We demonstrate how questions about prosodic breaks can be easily incorporated into the RFLM and present two language models which treat prosodic breaks as observable and hidden variables, respectively. Meanwhile, we show empirically that a finer grained prosodic break is needed for language modeling. Experimental results showed that, given prosodic breaks, we were able to reduce the LM perplexity by a significant margin, suggesting a prosodic N-best rescoring approach for ASR.
