Application of Expressive Speech in TTS System with Cepstral Description

Authors: Jiří Přibil, Anna Přibilová

DOI: 10.1007/978-3-540-70872-8_15

Keywords:

Abstract: Expressive speech synthesis representing different human emotions has interested researchers for a long time. Recently, some experiments with the storytelling speaking style have been performed. This particular speaking style is suitable for applications aimed at children as well as special applications aimed at blind people. By analyzing human storytellers' speech, we designed a set of prosodic parameter prototypes for converting speech produced by a text-to-speech (TTS) system into storytelling speech. In addition to the suprasegmental characteristics (pitch, intensity, and duration) included in these prototypes, information about the significant frequencies of the spectral envelope and about the spectral flatness, which determines the degree of voicing, was also used.

References (13)
Kjell Gustafson, Linda Bell, David House, Linn Johansson. Child-directed speech synthesis: evaluation of prosodic variation for an educational computer program. Conference of the International Speech Communication Association (1999).
Jiří Přibil, Anna Přibilová. Emotional style conversion in the TTS system with cepstral description. COST 2102'07: Proceedings of the 2007 COST Action 2102 International Conference on Verbal and Nonverbal Communication Behaviours, pp. 65-73 (2007). DOI: 10.1007/978-3-540-76442-7_6
Anna Esposito, Vojtěch Stejskal, Zdeněk Smékal, Nikolaos Bourbakis. The significance of empty speech pauses: cognitive and algorithmic issues. BVAI'07: Proceedings of the 2nd International Conference on Advances in Brain, Vision and Artificial Intelligence, pp. 542-554 (2007). DOI: 10.1007/978-3-540-75555-5_52
Olatunji O. Akande, Peter J. Murphy. Estimation of the vocal tract transfer function with application to glottal wave analysis. Speech Communication, vol. 46, pp. 15-36 (2005). DOI: 10.1016/J.SPECOM.2005.01.007
M. Unser. Splines: a perfect fit for signal and image processing. IEEE Signal Processing Magazine, vol. 16, pp. 22-38 (1999). DOI: 10.1109/79.799930
Anna Přibilová, Jiří Přibil. Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description. Speech Communication (non-linear speech processing), vol. 48, pp. 1691-1703 (2006). DOI: 10.1016/J.SPECOM.2006.08.001
Taisuke Ito, Kazuya Takeda, Fumitada Itakura. Analysis and recognition of whispered speech. Speech Communication, vol. 45, pp. 139-152 (2005). DOI: 10.1016/J.SPECOM.2003.10.005
A. Gray, J. Markel. A spectral-flatness measure for studying the autocorrelation method of linear prediction of speech analysis. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 22, pp. 207-217 (1974). DOI: 10.1109/TASSP.1974.1162572
Iree Cas, Anna Madlová, Kre Fei Stu. Two synthesis methods based on cepstral parameterization (2002).
Akemi Iida, Nick Campbell, Fumito Higuchi, Michiaki Yasumura. A corpus-based speech synthesis system with emotion. Speech Communication, vol. 40, pp. 161-187 (2003). DOI: 10.1016/S0167-6393(02)00081-X