MGNC-CNN: A Simple Approach to Exploiting Multiple Word Embeddings for Sentence Classification

Authors: Stephen Roller, Ye Zhang, Byron Wallace

Keywords: Embedding, Norm (mathematics), Curse of dimensionality, Feature vector, Artificial intelligence, Convolutional neural network, Computer science, Sentence, Pattern recognition

Abstract: We introduce a novel, simple convolutional neural network (CNN) architecture, the multi-group norm constraint CNN (MGNC-CNN), that capitalizes on multiple sets of word embeddings for sentence classification. MGNC-CNN extracts features from each input embedding set independently and then joins these at the penultimate layer of the network to form a final feature vector. We then adopt a group regularization strategy that differentially penalizes the weights associated with the subcomponents generated from the respective embedding sets. This model is much simpler than comparable alternative architectures and requires substantially less training time. Furthermore, it is flexible in that it does not require the input word embeddings to be of the same dimensionality. We show that MGNC-CNN consistently outperforms baseline models.
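The data flow described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`conv_max_pool`, `concat_features`, `class_scores`, `group_penalty`), the ReLU activation, the single filter height of 2, the random weights, and the use of a squared L2 penalty with one coefficient per embedding set are all hypothetical choices made for the example. The abstract only states that features are extracted per embedding set, joined at the penultimate layer, and regularized per group.

```python
import random

def conv_max_pool(sentence, filt):
    """Slide one filter (h x d weights) over a sentence matrix (n x d),
    apply ReLU, and keep the maximum activation over all positions."""
    h = len(filt)
    best = 0.0  # ReLU floor: negative activations are clipped to zero
    for start in range(len(sentence) - h + 1):
        window = sentence[start:start + h]
        act = sum(w * x
                  for row_w, row_x in zip(filt, window)
                  for w, x in zip(row_w, row_x))
        best = max(best, act)
    return best

def concat_features(sentences_by_group, filters_by_group):
    """Extract features from each embedding set independently, then join
    them into one vector (the penultimate layer). Also return the
    per-group features so the grouping stays visible."""
    per_group = [[conv_max_pool(sent, f) for f in filters]
                 for sent, filters in zip(sentences_by_group, filters_by_group)]
    joined = [v for feats in per_group for v in feats]
    return joined, per_group

def class_scores(per_group_feats, out_weights_by_group):
    """Linear output layer. Weights are stored per group, which is
    equivalent to one weight matrix applied to the joined vector."""
    n_classes = len(out_weights_by_group[0])
    scores = [0.0] * n_classes
    for feats, weights in zip(per_group_feats, out_weights_by_group):
        for c in range(n_classes):
            scores[c] += sum(w * f for w, f in zip(weights[c], feats))
    return scores

def group_penalty(out_weights_by_group, lambdas):
    """Assumed form of the group regularizer: a squared L2 penalty on
    each group's output weights, scaled by that group's coefficient."""
    return sum(lam * sum(w * w for row in weights for w in row)
               for weights, lam in zip(out_weights_by_group, lambdas))

# Toy setup (all values hypothetical): one 6-token sentence represented
# by two embedding sets of different dimensionality.
random.seed(0)
n_tokens, n_classes, filter_height = 6, 2, 2
dims = [3, 5]        # embedding dimension of each set
n_filters = [2, 3]   # number of filters applied to each set

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

sentences_by_group = [rand_matrix(n_tokens, d) for d in dims]
filters_by_group = [[rand_matrix(filter_height, d) for _ in range(k)]
                    for d, k in zip(dims, n_filters)]
out_weights_by_group = [rand_matrix(n_classes, k) for k in n_filters]

features, per_group = concat_features(sentences_by_group, filters_by_group)
scores = class_scores(per_group, out_weights_by_group)
penalty = group_penalty(out_weights_by_group, lambdas=[0.1, 1.0])
print(len(features), scores, penalty)
```

Because each embedding set has its own filters and its own block of output weights, the sets never need to share a dimensionality, and each block can be penalized with a different strength.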
