XIMERA: a new TTS from ATR based on corpus-based technologies.

Authors: Minoru Tsuzaki, Tomoki Toda, Keiichi Tokuda, Hisashi Kawai, Jinfu Ni

Keywords: Selection (genetic algorithm), Corpus based, Perception, Naturalness, Computer science, Hazard perception, Speech recognition, Hidden Markov model

Abstract: … As CHATR was originally designed to be a workbench for speech synthesis research, it supported equivalent modules for each TTS process. For the waveform generation process, it …

References (19)
Kazuo Hakoda, Tomohisa Hirokawa. Segment selection and pitch modification for high quality speech synthesis using waveform segments. Conference of the International Speech Communication Association (1990).
Hisashi Kawai, Seiichi Yamamoto, Tohru Shimizu, Norio Higuchi. A design method of speech corpus for text-to-speech synthesis taking account of prosody. Conference of the International Speech Communication Association, pp. 420-425 (2000).
Mark C. Beutnagel, Alistair Conkie, Ann K. Syrdal. Diphone synthesis using unit selection. SSW, pp. 185-190 (1998).
Eric Chang, Min Chu, Hu Peng, Yu Shi. Power spectral density based channel equalization of large speech database for concatenative TTS system. Conference of the International Speech Communication Association (2002).
Yoshinori Sagisaka, Nobuyoshi Kaiki, Naoto Iwahashi. Speech segment selection for concatenative synthesis based on spectral distortion minimization. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. 76, pp. 1942-1948 (1993).
T. Toda, H. Kawai, M. Tsuzaki, K. Shikano. Segment selection considering local degradation of naturalness in concatenative speech synthesis. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 696-699 (2003). DOI: 10.1109/ICASSP.2003.1198876
Yoshinori Sagisaka, Nobuyoshi Kaiki, Naoto Iwahashi, Katsuhiko Mimura. ATR μ-talk speech synthesis system. Conference of the International Speech Communication Association (1992).
K. Tokuda, T. Kobayashi, S. Imai. Speech parameter generation from HMM using dynamic features. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 660-663 (1995). DOI: 10.1109/ICASSP.1995.479684
Mark C. Beutnagel, Alistair D. Conkie, Juergen Schroeter, Yannis Stylianou, Ann K. Syrdal. The AT&T Next-Gen TTS System. The Journal of the Acoustical Society of America, vol. 105, p. 1030 (1999). DOI: 10.1121/1.424924