Tensor Fusion Network for Multimodal Sentiment Analysis

Authors: Soujanya Poria, Louis-Philippe Morency, Amir Zadeh, Minghai Chen, Erik Cambria

DOI:

Keywords: Speech recognition, Artificial intelligence, Computer science, Spoken language, Gesture, Natural language processing, Sentiment analysis, Tensor (intrinsic definition), Dynamics (music)

Abstract: Multimodal sentiment analysis is an increasingly popular research area, which extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language. In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics. We introduce a novel model, termed Tensor Fusion Network, which learns both such dynamics end-to-end. The proposed approach is tailored for the volatile nature of spoken language in online videos as well as accompanying gestures and voice. In the experiments, our model outperforms state-of-the-art approaches for both multimodal and unimodal sentiment analysis.

