Recognize basic emotional states in speech by machine learning techniques using mel-frequency cepstral coefficient features

Authors: Ningning Yang, Nilanjan Dey, R. Simon Sherratt, Fuqian Shi

DOI: 10.3233/JIFS-179963

Abstract: Speech Emotion Recognition (SER) has been widely used in many fields, such as the smart home assistants commonly found on the market. A smart home assistant that could detect the user’s emotion would improve communication between the user and the assistant, enabling the assistant to offer more productive feedback. Thus, the aim of this work is to analyze emotional states in speech and to propose a suitable algorithm, considering performance versus complexity, for deployment in smart home devices. Four sets of emotional speech were selected from the Berlin Emotional Database (EMO-DB) as experimental data, and 26 MFCC features were extracted from each type of emotional speech to identify the emotions of happiness, anger, sadness, and neutrality. Then, speaker-independent experiments for our SER were conducted using a Back Propagation Neural Network (BPNN), an Extreme Learning Machine (ELM), a Probabilistic Neural Network (PNN), and a Support Vector Machine (SVM). Synthesizing the recognition accuracy and the processing time, this work shows that the performance of SVM was the best among the four methods, making it a good candidate to be deployed for SER in smart home devices. SVM achieved an overall accuracy of 92.4% while offering low computational requirements when training and testing. We conclude that the classification models are highly effective in the automatic prediction of emotion.
