Comparison of Text-Independent Original Speaker Recognition from Emotionally Converted Speech

Authors: Jiří Přibil, Anna Přibilová

DOI: 10.1007/978-3-319-28109-4_14

Abstract: The paper describes an application of a classifier based on Gaussian mixture models (GMM) for reverse identification of the original speaker from emotionally transformed speech in Czech and Slovak. We investigate whether the score given by the GMM depends on the type and structure of the features used. Comparison with results obtained for sentences in German and English has shown that the balance of the database influences accuracy, but the language is not practically important. The evaluation experiments confirmed that the developed text-independent identifier is functional for closed-set classification tasks.

References (32)
Maria Teresa Riviello, Mohamed Chetouani, David Cohen, Anna Esposito. On the Perception of Emotional “Voices”: A Cross-Cultural Comparison among American, French and Italian Subjects. Lecture Notes in Computer Science, pp. 368-377 (2011). DOI: 10.1007/978-3-642-25775-9_34
László Tóth, Tamás Grósz. A Comparison of Deep Neural Network Training Methods for Large Vocabulary Speech Recognition. Text, Speech and Dialogue, pp. 36-43 (2013). DOI: 10.1007/978-3-642-40585-3_6
Jiří Přibil, Anna Přibilová. Application of Expressive Speech in TTS System with Cepstral Description. Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction, pp. 200-212 (2008). DOI: 10.1007/978-3-540-70872-8_15
Walter F. Sendlmeier, Astrid Paeschke, Felix Burkhardt, Benjamin Weiss, M. Rolfes. A database of German emotional speech. Conference of the International Speech Communication Association, pp. 1517-1520 (2005).
Jon Aaron Alcantara, Louie Patrice Lu, John Kynneth Magno, Zhayne Soriano, Ethel Ong, Ron Resurreccion. Emotional Narration of Children’s Stories. Proceedings in Information and Communications Technology, pp. 1-14 (2012). DOI: 10.1007/978-4-431-54106-6_1
Anna Přibilová, Jiří Přibil. Harmonic model for female voice emotional synthesis. BioID_MultiComm'09: Proceedings of the 2009 joint COST 2101 and 2102 international conference on Biometric ID management and multimodal communication, pp. 41-48 (2009). DOI: 10.1007/978-3-642-04391-8_6
Jiří Přibil, Anna Přibilová, Jindřich Matoušek. GMM classification of text-to-speech synthesis: identification of original speaker’s voice. Text, Speech and Dialogue, vol. 8655, pp. 365-373 (2014). DOI: 10.1007/978-3-319-10816-2_44
Piotr Staroniewicz, Wojciech Majewski. SVM based text-dependent speaker identification for large set of voices. European Signal Processing Conference, pp. 333-336 (2004).
Carla Lopes, Fernando Perdigao. Phoneme Recognition on the TIMIT Database. InTech (2011). DOI: 10.5772/17600