A robust multimodal approach for emotion recognition

Authors: Mingli Song, Mingyu You, Na Li, Chun Chen

DOI: 10.1016/J.NEUCOM.2007.07.041

Abstract: Emotion recognition is one of the latest challenges in intelligent human/computer communication. Most previous work on emotion recognition focused on extracting emotions from visual or audio information separately. A novel approach is presented in this paper, including both visual and audio information from video clips, to recognize human emotion. Facial Animation Parameters (FAPs) compliant facial feature tracking based on GASM (GPU-based Active Shape Model) is performed on the video to generate two vector streams, which represent the expression feature and the visual speech one. To extract effective speech features, based on geodesic distance estimation, we develop an enhanced Lipschitz embedding to embed high dimensional acoustic features into a low dimensional space. Combined with the visual vectors, the audio vector is extracted in terms of low dimensional features. Then, a tripled Hidden Markov Model is introduced to perform the recognition, which allows state asynchrony of the audio and visual observation sequences while preserving their natural correlation over time. The experimental results show that this approach outperforms conventional approaches for emotion recognition.

References (28)
C. Chen, M. Song, J. Bu, N. Li. Audio-visual based emotion recognition - a new approach. Computer Vision and Pattern Recognition, vol. 2, pp. 1020-1025 (2004). DOI: 10.1109/CVPR.2004.1315276
Ara V. Nefian, Luhong Liang, Xiaobo Pi, Liu Xiaoxiang, Crusoe Mao, Kevin Murphy. A coupled HMM for audio-visual speech recognition. IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, pp. 2013-2016 (2002). DOI: 10.1109/ICASSP.2002.5745027
Samy Bengio. An Asynchronous Hidden Markov Model for Audio-Visual Speech Recognition. Neural Information Processing Systems, vol. 15, pp. 1237-1244 (2002).
F. Lavagetto, R. Pockaj. The facial animation engine: toward a high-level interface for the design of MPEG-4 compliant animated faces. IEEE Transactions on Circuits and Systems for Video Technology, vol. 9, pp. 277-289 (1999). DOI: 10.1109/76.752095
Roddy Cowie, Ellen Douglas-Cowie, Susie Savvidou, Edelle McMahon, Martin Sawey, Marc Schröder. FEELTRACE: an instrument for recording perceived emotion in real time. Speech and Emotion: Proceedings of the ISCA Workshop (2000).
Jiajun Bu, Mingli Song, Qi Wu, Chun Chen, Cheng Jin. Sketch Based Facial Expression Recognition Using Graphics Hardware. Affective Computing and Intelligent Interaction, pp. 72-79 (2005). DOI: 10.1007/11573548_10
Valery A. Petrushin. Emotion recognition in speech signal: experimental study, development, and application. Conference of the International Speech Communication Association, pp. 222-225 (2000).
Kenji Mase. Recognition of Facial Expression from Optical Flow. IEICE Transactions on Information and Systems, vol. 74, pp. 3474-3483 (1991).
F. Dellaert, T. Polzin, A. Waibel. Recognizing emotion in speech. International Conference on Spoken Language Processing, vol. 3, pp. 1970-1973 (1996). DOI: 10.1109/ICSLP.1996.608022
Zhen Wen, Zhihong Zeng, Yuxiao Hu, Yun Fu, Thomas S. Huang, Glenn I. Roisman. Audio-visual emotion recognition in adult attachment interview. Proceedings of the 8th International Conference on Multimodal Interfaces - ICMI '06, pp. 139-145 (2006). DOI: 10.1145/1180995.1181028