HMM-Based Distributed Text-to-Speech Synthesis Incorporating Speaker-Adaptive Training

Authors: Kwang Myung Jeon, Seung Ho Choi

DOI: 10.14257/IJMUE.2014.9.5.10

Abstract: In this paper, a hidden Markov model (HMM) based distributed text-to-speech (TTS) system is proposed to synthesize the voices of various speakers in a client-server framework. The proposed system is based on speaker-adaptive training for constructing HMMs corresponding to a target speaker, and its computational complexity is balanced by distributing the processing modules of the TTS system at both the client and the server to achieve real-time operation. In other words, less complex operations, such as text input and HMM-based speech synthesis, are conducted by the client, while speaker-adaptive training, which is a very complex operation, is assigned to the server. The performance evaluation shows that the proposed system operates in real time and provides good synthesized speech quality in terms of intelligibility and similarity.

References (23)
John Kominek, Alan W. Black, The CMU Arctic speech databases. SSW. pp. 223-224 (2004)
Yannis Stylianou, Jean Laroche, Eric Moulines, High-quality speech modification based on a harmonic + noise model. Conference of the International Speech Communication Association. (1995)
Junichi Yamagishi, Takao Kobayashi, Makoto Tachibana, Yuji Nakano, Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis. Conference of the International Speech Communication Association. (2006)
Keiichi Tokuda, Alan W. Black, Heiga Zen, An HMM-based speech synthesis system applied to English. (2003)
K. K. Paliwal, W. B. Kleijn, Speech Coding and Synthesis. Elsevier Science Inc. (1995)
Tomoki Toda, Heiga Zen, An Overview of Nitech HMM-based Speech Synthesis System for Blizzard Challenge 2005. Conference of the International Speech Communication Association. pp. 93-96 (2005)
Keiichiro Oura, Junichi Yamagishi, Tomoki Toda, Shinji Sako, Takashi Nose, Takashi Masuko, Keiichi Tokuda, Alan W. Black, Heiga Zen, Recent development of the HMM-based speech synthesis system (HTS). Asia Pacific Signal and Information Processing Association Annual Summit and Conference. pp. 121-130 (2009)
K. Tokuda, T. Kobayashi, S. Imai, Speech parameter generation from HMM using dynamic features. International Conference on Acoustics, Speech, and Signal Processing. vol. 1, pp. 660-663 (1995), 10.1109/ICASSP.1995.479684
Yoo Rhee Oh, Hong Kook Kim, A Hybrid Acoustic and Pronunciation Model Adaptation Approach for Non-native Speech Recognition. IEICE Transactions on Information and Systems. vol. 93, pp. 2379-2387 (2010), 10.1587/TRANSINF.E93.D.2379