Hierarchical Attention Networks for Document Classification

Authors: Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola

DOI: 10.18653/V1/N16-1174

Abstract: We propose a hierarchical attention network for document classification. Our model has two distinctive characteristics: (i) it has a hierarchical structure that mirrors the hierarchical structure of documents; (ii) it has two levels of attention mechanisms, applied at the word level and the sentence level, enabling it to attend differentially to more and less important content when constructing the document representation. Experiments conducted on six large-scale text classification tasks demonstrate that the proposed architecture outperforms previous methods by a substantial margin. Visualization of the attention layers illustrates that the model selects qualitatively informative words and sentences.
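The two-level attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the recurrent encoders that would normally produce the hidden states are omitted, random vectors stand in for them, and the names `attention_pool`, `encode_document`, and `make_params` as well as the dimensions are assumptions chosen for illustration only.

```python
import numpy as np

def attention_pool(H, W, b, u):
    """Collapse a sequence of vectors H (n, d) into one vector (d,).

    Each row is scored against a context vector u after a tanh projection;
    the softmax of the scores gives the attention weights.
    """
    scores = np.tanh(H @ W + b) @ u              # shape (n,)
    scores = scores - scores.max()               # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ H, weights

def encode_document(doc, word_params, sent_params):
    """doc is a list of sentences, each an (n_words, d) array of word states."""
    # Word-level attention: one vector per sentence.
    sent_vecs = np.stack([attention_pool(S, *word_params)[0] for S in doc])
    # Sentence-level attention: one vector for the whole document.
    return attention_pool(sent_vecs, *sent_params)

# Hypothetical sizes: d = hidden size, a = attention projection size.
rng = np.random.default_rng(0)
d, a = 8, 4

def make_params():
    # Random stand-ins for parameters that would be learned in practice.
    return rng.normal(size=(d, a)), np.zeros(a), rng.normal(size=a)

word_params, sent_params = make_params(), make_params()

# A toy document with three sentences of 5, 3, and 7 words.
doc = [rng.normal(size=(n, d)) for n in (5, 3, 7)]
doc_vec, sent_weights = encode_document(doc, word_params, sent_params)
```

In this sketch `doc_vec` would feed a classifier, and `sent_weights` (one weight per sentence, summing to 1) is the kind of quantity the abstract refers to when it mentions visualizing the attention layers.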
