Voice Modification, Synthetic

Author: J. Schroeter

DOI: 10.1016/B0-08-044854-2/00916-0

Abstract: A significant part of the work required to create a high-quality speech synthesizer is the creation of ‘synthetic voices.’ Reusing an existing voice database and making it sound like a different speaker, or like the same speaker in a different emotional state or using a different speaking style, is obviously important for increasing the efficiency of creating voice options for a synthesizer. This article reviews techniques that change signal characteristics such as pitch and durations, and also spectral modifications. We conclude by assessing the prospects of voice modification for synthesis in light of now-available advanced machine learning techniques.

References (23)
Jan P. H. van Santen, Combinatorial issues in text-to-speech synthesis. Conference of the International Speech Communication Association (1997)
John F. Pitrelli, Janet B. Pierrehumbert, Julia Hirschberg, Colin W. Wightman, Mary E. Beckman, Mari Ostendorf, Patti Price, Kim E. A. Silverman, TOBI: a standard for labeling English prosody. Conference of the International Speech Communication Association (1992)
J. Hirschberg, Speech Synthesis: Prosody. Encyclopedia of Language & Linguistics (Second Edition), pp. 49-55 (2006), DOI: 10.1016/B0-08-044854-2/00914-7
Tanja Klankert, Norbert Braunschweiler, Bettina Säuberlich, Bernd Möbius, Antje Schweitzer, Restricted unlimited domain synthesis. Conference of the International Speech Communication Association (2003)
A.K. Syrdal, Acoustic variability in spontaneous conversational speech of American English talkers. International Conference on Spoken Language Processing, vol. 1, pp. 438-441 (1996), DOI: 10.1109/ICSLP.1996.607148
A. Kain, Y. Stylianou, Stochastic modeling of spectral adjustment for high quality pitch modification. International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 949-952 (2000), DOI: 10.1109/ICASSP.2000.859118
D. Pisoni, R. Bernacki, H. Nusbaum, M. Yuchtman, Some acoustic-phonetic correlates of speech produced in noise. International Conference on Acoustics, Speech, and Signal Processing, vol. 10, pp. 1581-1584 (1985), DOI: 10.1109/ICASSP.1985.1168217
Yoshinori Sagisaka, Nobuyoshi Kaiki, Naoto Iwahashi, Katsuhiko Mimura, ATR μ-talk speech synthesis system. Conference of the International Speech Communication Association (1992)
Alistair Conkie, A robust unit selection system for speech synthesis. Journal of the Acoustical Society of America, vol. 105, p. 978 (1999), DOI: 10.1121/1.425343
Masanobu Abe, Satoshi Nakamura, Kiyohiro Shikano, Hisao Kuwabara, Voice conversion through vector quantization. Journal of the Acoustical Society of Japan (E), vol. 11, pp. 71-76 (1990), DOI: 10.1250/AST.11.71