Topic Classification Problem Solving for Morphologically Complex Languages

Authors: Jurgita Kapočiūtė-Dzikienė, Tomas Krilavičius

DOI: 10.1007/978-3-319-46254-7_41

Abstract: In this paper we present a topic classification task for the morphologically complex Lithuanian and Russian languages, using popular supervised machine learning techniques. In our research we experimentally investigated two text classification methods and a wide variety of feature types covering different levels of abstraction: character, lexical, and morpho-syntactic. To obtain comparable results for both languages, we kept the experimental conditions as similar as possible: the datasets were composed of normative texts taken from news portals, contained similar topics, and had the same number of texts in each topic.
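To make the difference between character-level and lexical-level features concrete, here is a minimal sketch in Python. It is not the authors' feature extraction: the function names `char_ngrams` and `token_counts`, the trigram size, and the example word pair are assumptions chosen for illustration. It shows that two inflected forms of one Lithuanian noun share several character trigrams while sharing no whole-token feature.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (character-level features).

    The text is lowercased and padded with one space on each side so that
    word-initial and word-final n-grams are distinguishable.
    """
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def token_counts(text):
    """Count lowercased whitespace-separated tokens (lexical-level features)."""
    return Counter(text.lower().split())


# Illustrative assumption: two inflected forms of the Lithuanian noun
# "miestas" (city), nominative and genitive singular.
a, b = "miestas", "miesto"

# Character trigrams that both word forms have in common.
shared_trigrams = set(char_ngrams(a)) & set(char_ngrams(b))

# Whole tokens that both word forms have in common (none, since they differ).
shared_tokens = set(token_counts(a)) & set(token_counts(b))

print(sorted(shared_trigrams))
print(shared_tokens)
```

Morpho-syntactic features such as lemmas or part-of-speech tags would need a language-specific analyzer and are left out of this sketch.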
