Word Embeddings for Multi-label Document Classification.

Authors: Ladislav Lenc, Pavel Král

DOI: 10.26615/978-954-452-049-6_057

Abstract: In this paper, we analyze and evaluate word embeddings for representation of longer texts in the multi-label classification scenario. The embeddings are used in three convolutional neural network topologies. The experiments are realized on the Czech CTK and English Reuters-21578 standard corpora. We compare the results of word2vec static and trainable embeddings with randomly initialized word vectors. We conclude that initialization does not play an important role for classification. However, learning of word vectors is crucial to obtain good results.
