Context-Sensitive Learning for Enhanced Audiovisual Emotion Classification

Authors: A. Metallinou, M. Wöllmer, A. Katsamanis, F. Eyben, B. Schuller

DOI: 10.1109/T-AFFC.2011.40

Abstract: Human emotional expression tends to evolve in a structured manner, in the sense that certain emotional evolution patterns, e.g., anger to anger, are more probable than others, e.g., anger to happiness. Furthermore, the perception of an emotional display can be affected by recent emotional displays. Therefore, the emotional content of past and future observations could offer relevant temporal context when classifying the emotional content of an observation. In this work, we focus on audio-visual recognition of the emotional content of improvised emotional interactions at the utterance level. We examine context-sensitive schemes for emotion recognition within a multimodal, hierarchical approach: bidirectional Long Short-Term Memory (BLSTM) neural networks, Hidden Markov Model classifiers (HMMs), and hybrid HMM/BLSTM classifiers are considered for modeling emotion evolution within an utterance and between utterances over the course of a dialog. Overall, our experimental results indicate that incorporating long-term temporal context is beneficial for emotion recognition systems that encounter a variety of emotional manifestations. Context-sensitive approaches outperform those without context for classification tasks such as discrimination between valence levels or between clusters in the valence-activation space. The analysis of emotional transitions in our database sheds light on the flow of affective expressions, revealing potentially useful patterns.
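
The abstract's claim that some emotional transitions (anger to anger) are more probable than others (anger to happiness) can be illustrated with a first-order transition estimate over utterance labels. The following is a minimal sketch, not the authors' analysis: the function name `transition_probabilities` and the toy label sequences are assumptions made up for illustration and are not taken from the paper or its database.

```python
from collections import Counter, defaultdict


def transition_probabilities(dialogs):
    """Estimate P(next label | current label) from per-dialog label sequences."""
    counts = defaultdict(Counter)
    for labels in dialogs:
        # Count each consecutive pair of utterance labels within one dialog.
        for current, following in zip(labels, labels[1:]):
            counts[current][following] += 1
    probs = {}
    for current, following_counts in counts.items():
        total = sum(following_counts.values())
        probs[current] = {label: n / total for label, n in following_counts.items()}
    return probs


# Hypothetical toy label sequences, one list per dialog (NOT data from the paper).
toy_dialogs = [
    ["anger", "anger", "anger", "neutral"],
    ["happiness", "happiness", "neutral", "anger", "anger"],
]
probs = transition_probabilities(toy_dialogs)
```

In this toy data, staying in anger is the most frequent transition out of anger, and a direct jump from anger to happiness never occurs. A context-sensitive classifier such as an HMM or BLSTM over utterances can exploit this kind of regularity, whereas a classifier that treats each utterance in isolation cannot.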

References (47)
Ira Cohen, Thomas S. Huang, Ashutosh Garg. Emotion Recognition from Facial Expressions using Multilevel HMM (2000).
Chalapathy Neti, Guillaume Gravier, Gerasimos Potamianos. Asynchrony modeling for audio-visual speech recognition. International Conference on Human Language Technology Research, pp. 1-6 (2002).
M. Hall. Correlation-based Feature Selection for Machine Learning. PhD thesis, Waikato University (1998).
Gordon McIntyre, Roland Göcke. Towards affective sensing. International Conference on Human Computer Interaction, pp. 411-420 (2007). DOI: 10.1007/978-3-540-73110-8_44
Ellen Douglas-Cowie, Martin Wöllmer, Roddy Cowie, Björn W. Schuller, Florian Eyben. Data-driven Clustering in Emotional Space for Affect Recognition Using Discriminatively Trained LSTM Networks. Conference of the International Speech Communication Association, pp. 1595-1598 (2009).
Tanja Bänziger, Klaus R. Scherer. Using Actor Portrayals to Systematically Study Multimodal Emotion Expression: The GEMEP Corpus. Affective Computing and Intelligent Interaction, pp. 476-487 (2007). DOI: 10.1007/978-3-540-74889-2_42
Jürgen Schmidhuber, Alex Graves, Santiago Fernández. Bidirectional LSTM networks for improved phoneme classification and recognition. International Conference on Artificial Neural Networks, pp. 799-804 (2005). DOI: 10.1007/11550907_126
Mark A. Hall, Ian H. Witten, Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques (1999).
Christopher D. Manning, Hinrich Schütze. Foundations of Statistical Natural Language Processing (1999).