Authors: Soujanya Poria, Louis-Philippe Morency, Amir Zadeh, Minghai Chen, Erik Cambria
DOI:
Keywords: Speech recognition, Artificial intelligence, Computer science, Spoken language, Gesture, Natural language processing, Sentiment analysis, Tensor (intrinsic definition), Dynamics (music)
Abstract: Multimodal sentiment analysis is an increasingly popular research area, which extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language. In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics. We introduce a novel model, termed Tensor Fusion Network, which learns both such dynamics end-to-end. The proposed approach is tailored for the volatile nature of spoken language in online videos as well as accompanying gestures and voice. In the experiments, our model outperforms state-of-the-art approaches for both multimodal and unimodal sentiment analysis.