SAPKOS: Experimental Czech Multi-label Document Classification and Analysis System

Authors: Ladislav Lenc, Pavel Král

DOI: 10.1007/978-3-319-23868-5_24

Keywords: Principle of maximum entropy, Czech, Task (project management), Computer science, Machine learning, Systems architecture, Measure (data warehouse), Artificial intelligence, Document classification, Annotation, Information retrieval, Newspaper

Abstract: This paper presents an experimental multi-label document classification and analysis system called SAPKOS. The system, which integrates state-of-the-art machine learning and natural language processing approaches, is intended to be used by the Czech News Agency (CTK). Its main purpose is to save human resources in the task of annotating newspaper articles with topics. Another important functionality is the automatic comparison of CTK production with popular media. The results of this comparison will be used to adapt the production to better correspond to today’s market requirements. An interesting contribution is that, to the best of our knowledge, no other such system exists. It is also worth mentioning that the system's accuracy is very high. This score is obtained due to the unique system architecture, which combines a maximum entropy based classification engine with a novel confidence measure method.
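
The abstract names the two building blocks of the classifier but not their details. As a rough illustration of the general idea only, the following sketch trains one binary maximum entropy (logistic regression) model per topic and keeps every topic whose posterior probability reaches a threshold. Everything here is an assumption rather than the SAPKOS implementation: the bag-of-words features, the function names, the toy English corpus, and in particular the plain probability threshold, which merely stands in for the paper's novel confidence measure.

```python
import math
from collections import Counter


def features(text):
    # Bag-of-words counts; a placeholder for the system's real feature set.
    return Counter(text.lower().split())


def posterior(weights, bias, feats):
    # Binary maximum entropy model: P(label | doc) = sigmoid(w . x + b).
    z = bias + sum(weights[word] * count for word, count in feats.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_label(docs, targets, epochs=200, lr=0.5):
    # Stochastic gradient ascent on the log-likelihood of a single label.
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        for feats, y in zip(docs, targets):
            err = y - posterior(weights, bias, feats)
            bias += lr * err
            for word, count in feats.items():
                weights[word] += lr * err * count
    return weights, bias


def train(corpus):
    # One independent binary model per topic (binary relevance).
    docs = [features(text) for text, _ in corpus]
    labels = sorted({label for _, topics in corpus for label in topics})
    return {
        label: train_label(
            docs, [1.0 if label in topics else 0.0 for _, topics in corpus]
        )
        for label in labels
    }


def predict_labels(models, text, threshold=0.5):
    # Keep every topic whose posterior reaches the confidence threshold
    # (a simple stand-in for the paper's confidence measure).
    feats = features(text)
    return {
        label
        for label, (weights, bias) in models.items()
        if posterior(weights, bias, feats) >= threshold
    }


# Toy corpus, invented for illustration only.
corpus = [
    ("government passes new budget law", {"politics"}),
    ("team wins football match", {"sport"}),
    ("parliament debates stadium funding for football", {"politics", "sport"}),
    ("striker scores goal in match", {"sport"}),
    ("minister proposes tax law", {"politics"}),
]
models = train(corpus)
```

Calling `predict_labels(models, some_text)` returns a set of topics, which may be empty or contain several labels. Raising `threshold` can only shrink that set, so it trades recall for precision.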
