Authors: Pavel Král, Ladislav Lenc
DOI: 10.1007/978-3-319-18117-2_39
Keywords: Artificial intelligence, Computer science, Classifier (UML), Machine learning, Document classification, Czech, Latent Dirichlet allocation
Abstract: This paper deals with automatic document classification in the context of a real application for the Czech News Agency (CTK). The accuracy of our classifier is high; however, it is still important to improve the results. The main goal of this paper is thus to propose novel confidence measure approaches in order to detect and remove incorrectly classified samples. Two of the proposed methods are based on the posterior class probability, and the third one is a supervised approach which uses another classifier to determine if the result is correct. The methods are evaluated on a newspaper corpus. We experimentally show that it is beneficial to integrate the confidence measures into the document classification task because they significantly improve the accuracy.