Unsupervised joint prosody labeling and modeling for Mandarin speech.

Authors: Chen-Yu Chiang, Sin-Horng Chen, Hsiu-Min Yu, Yih-Ru Wang

Keywords: Syllable, Speech recognition, Feature (machine learning), Mandarin Chinese, Variation (linguistics), Nonverbal communication, Prosody, Natural language, Juncture, Linguistics, Computer science, Discriminative model

Abstract: An unsupervised joint prosody labeling and modeling method for Mandarin speech is proposed, a new scheme intended to construct statistical prosodic models and to label prosodic tags consistently for Mandarin speech. Two types of prosodic tags are determined by four prosodic models designed to illustrate the hierarchy of Mandarin prosody: the break of a syllable juncture, which demarcates prosodic constituents, and the prosodic state, which represents any prosodic domain’s pitch-level variation resulting from its upper-layered prosodic constituents’ influences. The performance of the proposed method was evaluated using an unlabeled read-speech corpus articulated by an experienced female announcer. Experimental results showed that the estimated parameters of the four prosodic models were able to explore and describe the structures and patterns of Mandarin prosody. In addition, certain corresponding relationships between the labeled break indices and the associated words were found, and these manifested the connections between prosodic and linguistic parameters, a finding that further verifies the capability of the method presented. Finally, a quantitative comparison of the labeling results of the proposed method and of human labelers indicated that the former was more consistent and discriminative than the latter in prosodic feature distributions, a merit of the method developed here for applications of prosody modeling.

