Deep learning for robust feature generation in audiovisual emotion recognition

Authors: Yelin Kim, Honglak Lee, Emily Mower Provost

DOI: 10.1109/ICASSP.2013.6638346

Keywords: SIGNAL (programming language), Emotion classification, Speech recognition, Focus (optics), Feature selection, Artificial intelligence, Feature (machine learning), Deep belief network, Computer science, Machine learning, Deep learning

Abstract: … features for audio-visual emotion recognition. Emotion recognition accuracy relies heavily on the ability to generate representative features. However, this is a very challenging problem. …

References (37)

Włodzisław Duch, Jacek Biesiada, Tomasz Winiarski, Karol Grudziński, Krzysztof Grąbczewski. Feature Selection Based on Information Theory Filters. Physica, Heidelberg, pp. 173-178 (2003). DOI: 10.1007/978-3-7908-1902-1_23

Carlos Busso, Sungbok Lee, Shrikanth S. Narayanan. Using Neutral Speech Models for Emotional Speech Analysis. Conference of the International Speech Communication Association, pp. 2225-2228 (2007).

Tim Polzehl, Hamed Ketabdar, Michael Wagner, Florian Metze, Shiva Sundaram. Emotion Classification in Children's speech using fusion of acoustic and linguistic features. Conference of the International Speech Communication Association, pp. 340-343 (2009).

Alessandro Vinciarelli, Elmar Nöth, Rob van Son, Björn W. Schuller, Stefan Steidl, Felix Burkhardt, Benjamin Weiss, Tobias Bocklet, Florian Eyben, Felix Weninger, Gelareh Mohammadi, Anton Batliner. The INTERSPEECH 2012 Speaker Trait Challenge. Conference of the International Speech Communication Association, pp. 254-257 (2012).

Björn W. Schuller, Stefan Steidl, Anton Batliner. The INTERSPEECH 2009 Emotion Challenge. Conference of the International Speech Communication Association, pp. 312-315 (2009).

P. Smolensky. Information processing in dynamical systems: foundations of harmony theory. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1, pp. 194-281 (1986).

Gerhard Rigoll, Bernd Radig, Dejan Arsic, Björn W. Schuller, Matthias Wimmer. Low-Level Fusion of Audio and Video Feature for Multi-Modal Emotion Recognition. International Conference on Computer Vision Theory and Applications, pp. 145-151 (2008).

Chris Eliasmith, Yichuan Tang. Deep networks for robust visual recognition. International Conference on Machine Learning, pp. 1055-1062 (2010).

Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng. Unsupervised learning of hierarchical representations with convolutional deep belief networks. Communications of the ACM, vol. 54, pp. 95-103 (2011). DOI: 10.1145/2001269.2001295

Abdel-rahman Mohamed, George E. Dahl, Geoffrey Hinton. Acoustic Modeling Using Deep Belief Networks. IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, pp. 14-22 (2012). DOI: 10.1109/TASL.2011.2109382
