An improved KNN algorithm for text classification

Authors: Jingzhong Wang, Xia Li

DOI: 10.1109/ICINA.2010.5636476

Keywords:

Abstract: This paper analyzes the advantages and disadvantages of the KNN algorithm and introduces an improved KNN algorithm (WPSOKNN) for text classification. It is based on particle swarm optimization, which has the ability to perform random, directed global search within the training document set. During the search for the k nearest neighbors of a test sample, those document vectors that cannot be among the k closest are discarded quickly. In addition, the algorithm reduces the impact of individual particles on the swarm as a whole. Moreover, an interference factor is introduced to avoid premature convergence and to find the k nearest samples quickly. We conducted an extensive experimental study using real datasets, and the results show that the WPSOKNN algorithm is more efficient than other KNN algorithms.
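For context, the baseline that the abstract sets out to improve, plain KNN classification over document vectors, can be sketched as below. This is not the WPSOKNN method from the paper: the function names, the toy term-weight vectors, the labels, and the choice of cosine similarity are all assumptions made for illustration only.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_classify(train, query, k=3):
    """Return the majority label among the k training vectors most similar to query.

    train is a list of (vector, label) pairs (hypothetical toy format).
    """
    # Exhaustive search: every training vector is compared with the query.
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented for this sketch: three term weights per document.
train = [
    ([1.0, 0.0, 0.2], "sports"),
    ([0.9, 0.1, 0.0], "sports"),
    ([0.0, 1.0, 0.3], "finance"),
    ([0.1, 0.8, 0.1], "finance"),
]

print(knn_classify(train, [0.8, 0.1, 0.1], k=3))  # prints "sports"
```

The sort in this sketch scores every training document for each query. That exhaustive scan is the cost the abstract says the improved algorithm reduces by quickly discarding vectors that cannot be among the k closest.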

References (5)
Yiming Yang. A study of thresholding strategies for text categorization. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 137-145 (2001). DOI: 10.1145/383952.383975
Gisli R. Hjaltason, Hanan Samet. Index-driven similarity search in metric spaces (Survey Article). ACM Transactions on Database Systems, vol. 28, pp. 517-580 (2003). DOI: 10.1145/958942.958948
David W. Aha, Dennis Kibler, Marc K. Albert. Instance-Based Learning Algorithms. Machine Learning, vol. 6, pp. 37-66 (1991). DOI: 10.1023/A:1022689900470
D.A. White, R. Jain. Similarity indexing with the SS-tree. International Conference on Data Engineering, pp. 516-523 (1996). DOI: 10.1109/ICDE.1996.492202
Yu Wang, Zheng-Ou Wang. Text categorization rule extraction based on fuzzy decision tree. International Conference on Machine Learning and Cybernetics, vol. 4, pp. 2122-2127 (2005). DOI: 10.1109/ICMLC.2005.1527296