Authors: K. Chen, M. Hasegawa-Johnson, A. Cohen, S. Borys, Sung-Suk Kim
Keywords:
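Abstract: Does prosody help word recognition? This paper proposes a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that reduces word error rates (WER) relative to a prosody-independent recognizer with comparable parameter count. In the proposed prosody-dependent speech recognizer, word and phoneme models are conditioned on two important prosodic variables: the intonational phrase boundary and the pitch accent. An information-theoretic analysis is provided to show that prosody-dependent acoustic and language modeling can increase the mutual information between the true word hypothesis and the acoustic observation by exciting the interaction between the prosody-dependent acoustic model and the prosody-dependent language model. Empirically, results indicate that the influence of these prosodic variables on allophonic models is mainly restricted to a small subset of distributions: the duration PDFs (modeled using an explicit duration hidden Markov model, or EDHMM) and the acoustic-prosodic observation PDFs (normalized pitch frequency). The influence of prosody on cepstral features is limited to a subset of phonemes: for example, vowels may be influenced by both accent and phrase position, but phrase-initial and phrase-final consonants are independent of accent. Leveraging these results, effective prosody-dependent allophonic models are built with a minimal increase in parameter count. These prosody-dependent speech recognizers are able to reduce word error rates by up to 11% relative to prosody-independent recognizers with comparable parameter count, in experiments based on the prosodically-transcribed Boston Radio News corpus.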
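
The contrast the abstract draws between prosody-independent and prosody-dependent recognition can be sketched as two decoding rules. This is an illustrative reading of the abstract, not a formula quoted from the paper; the symbols W (word sequence), P (sequence of prosodic labels covering phrase boundary and pitch accent), and O (acoustic observation) are notation assumed here.

```latex
% Assumed notation (not taken from the abstract):
%   W = word sequence, P = prosodic label sequence, O = acoustic observation
\begin{aligned}
\text{prosody-independent:}\quad
  \hat{W} &= \arg\max_{W} \; p(O \mid W)\, p(W) \\
\text{prosody-dependent:}\quad
  (\hat{W}, \hat{P}) &= \arg\max_{W,\,P} \; p(O \mid W, P)\, p(W, P)
\end{aligned}
```

Under this reading, p(O | W, P) plays the role of the prosody-dependent acoustic model and p(W, P) that of the prosody-dependent language model, so the two models interact through the shared prosodic labels P.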