Prosody dependent speech recognition on radio news corpus of American English

Authors: K. Chen, M. Hasegawa-Johnson, A. Cohen, S. Borys, Sung-Suk Kim

DOI: 10.1109/TSA.2005.853208


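Abstract: Does prosody help word recognition? This paper proposes a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that reduces word error rates (WER) relative to a prosody-independent recognizer with comparable parameter count. In the proposed prosody-dependent speech recognizer, word and phoneme models are conditioned on two important prosodic variables: the intonational phrase boundary and the pitch accent. An information-theoretic analysis is provided to show that prosody-dependent acoustic and language modeling can increase the mutual information between the true word hypothesis and the acoustic observation by exciting the interaction between the prosody-dependent acoustic model and the prosody-dependent language model. Empirically, results indicate that the influence of these prosodic variables on allophonic models is mainly restricted to a small subset of distributions: the duration PDFs (modeled using an explicit duration hidden Markov model, or EDHMM) and the acoustic-prosodic observation PDFs (normalized pitch frequency). The influence of prosody on cepstral features is limited to a subset of phonemes: for example, vowels may be influenced by both accent and phrase position, but phrase-initial and phrase-final consonants are independent of accent. Leveraging these results, effective prosody-dependent allophonic models are built with a minimal increase in parameter count. These prosody-dependent speech recognizers are able to reduce word error rates by up to 11% relative to prosody-independent recognizers with comparable parameter count, in experiments based on the prosodically-transcribed Boston Radio News corpus.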
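
As a rough sketch of the framework the abstract describes, the decoding problem can be written as follows. The notation is assumed here for illustration and does not come from this page: $O$ is the acoustic observation sequence, $W$ the word sequence, and $P$ the sequence of prosodic labels (pitch accent and intonational phrase boundary). The last line shows what a relative WER reduction means, with $\mathrm{WER}_{\mathrm{PI}}$ and $\mathrm{WER}_{\mathrm{PD}}$ as assumed names for the prosody-independent and prosody-dependent error rates.

```latex
% Prosody-independent recognizer: search over word sequences only
\hat{W} = \arg\max_{W} \; p(O \mid W)\, p(W)

% Prosody-dependent recognizer: search jointly over words and prosodic labels,
% with both the acoustic model and the language model conditioned on P
[\hat{W}, \hat{P}] = \arg\max_{W,\,P} \; p(O \mid W, P)\, p(W, P)

% Relative WER reduction (the sense in which "up to 11%" is reported)
\Delta_{\mathrm{rel}} = \frac{\mathrm{WER}_{\mathrm{PI}} - \mathrm{WER}_{\mathrm{PD}}}{\mathrm{WER}_{\mathrm{PI}}}
```

Under this reading, prosody helps word recognition only if the acoustic term and the language-model term both depend on $P$, which matches the abstract's point about the interaction between the two models.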


