Temporal Attention-Gated Model for Robust Sequence Classification

Authors: Wenjie Pei, Tadas Baltrusaitis, David M. J. Tax, Louis-Philippe Morency

DOI: 10.1109/CVPR.2017.94

Abstract: Typical techniques for sequence classification are designed for well-segmented sequences which have been edited to remove noisy or irrelevant parts. Therefore, such methods cannot be easily applied on noisy sequences expected in real-world applications. In this paper, we present the Temporal Attention-Gated Model (TAGM), which integrates ideas from attention models and gated recurrent networks to better deal with noisy or unsegmented sequences. Specifically, we extend the concept of the attention model to measure the relevance of each observation (time step) of a sequence. We then use a novel gated recurrent network to learn the hidden representation for the final prediction. An important advantage of our approach is interpretability, since the temporal attention weights provide a meaningful value for the salience of each time step. We demonstrate the merits of our TAGM approach, in terms of both prediction accuracy and interpretability, on three different tasks: spoken digit recognition, text-based sentiment analysis, and visual event recognition.
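The abstract describes a recurrent update in which a per-time-step attention weight gates how much of each observation enters the hidden state. The sketch below illustrates that idea in NumPy; the function name `tagm_step`, the fixed attention value, and the weight shapes are illustrative assumptions (in the paper, the salience weight is produced by a separate bidirectional-RNN attention module), not the authors' exact formulation.

```python
import numpy as np

def tagm_step(h_prev, x_t, a_t, W, U, b):
    """One attention-gated recurrent update (illustrative sketch).

    The scalar salience a_t in [0, 1] interpolates between keeping the
    previous hidden state (a_t = 0) and fully accepting the candidate
    update computed from the current observation (a_t = 1).
    """
    h_cand = np.tanh(W @ x_t + U @ h_prev + b)   # candidate hidden state
    return (1.0 - a_t) * h_prev + a_t * h_cand   # gated interpolation

# Tiny usage example with random parameters (shapes are assumptions).
rng = np.random.default_rng(0)
d_h, d_x = 4, 3
W = rng.normal(size=(d_h, d_x))
U = rng.normal(size=(d_h, d_h))
b = np.zeros(d_h)
h = np.zeros(d_h)
for x_t in rng.normal(size=(5, d_x)):
    a_t = 0.5  # in the paper, a_t comes from an attention module, not a constant
    h = tagm_step(h, x_t, a_t, W, U, b)
```

Because irrelevant time steps receive a_t near zero, they leave the hidden state almost unchanged, which is how the model tolerates unsegmented, noisy input while the learned weights remain interpretable as per-step salience.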

References (43)
Richard Socher, Andrew Y. Ng, Eric H. Huang, Christopher D. Manning, Jeffrey Pennington, "Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions", Empirical Methods in Natural Language Processing, pp. 151-161 (2011)
Yu-Gang Jiang, Qi Dai, Tao Mei, Yong Rui, Shih-Fu Chang, "Super Fast Event Recognition in Internet Videos", IEEE Transactions on Multimedia, vol. 17, pp. 1174-1186 (2015), DOI: 10.1109/TMM.2015.2436813
Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhudinov, Rich Zemel, Yoshua Bengio, "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention", International Conference on Machine Learning, vol. 3, pp. 2048-2057 (2015)
Li Yao, Atousa Torabi, Kyunghyun Cho, Nicolas Ballas, Christopher Pal, Hugo Larochelle, Aaron Courville, "Describing Videos by Exploiting Temporal Structure", 2015 IEEE International Conference on Computer Vision (ICCV), pp. 4507-4515 (2015), DOI: 10.1109/ICCV.2015.512
Alex Waibel, Kai-Fu Lee, "Readings in Speech Recognition", Morgan Kaufmann Publishers Inc. (1990)
Geoffrey E. Hinton, Vinod Nair, "Rectified Linear Units Improve Restricted Boltzmann Machines", International Conference on Machine Learning, pp. 807-814 (2010)
Alex Graves, "Generating Sequences With Recurrent Neural Networks", arXiv: Neural and Evolutionary Computing (2013)
Yoon Kim, "Convolutional Neural Networks for Sentence Classification", Empirical Methods in Natural Language Processing, pp. 1746-1751 (2014), DOI: 10.3115/V1/D14-1181
Xinlei Chen, C. Lawrence Zitnick, "Mind's Eye: A Recurrent Visual Representation for Image Caption Generation", Computer Vision and Pattern Recognition, pp. 2422-2431 (2015), DOI: 10.1109/CVPR.2015.7298856
Thang Luong, Hieu Pham, Christopher D. Manning, "Effective Approaches to Attention-based Neural Machine Translation", Empirical Methods in Natural Language Processing, pp. 1412-1421 (2015), DOI: 10.18653/V1/D15-1166