Document Classification Using Enhanced Grid Based Clustering Algorithm

Authors: Mohamed Ahmed Rashad, Hesham El-Deeb, Mohamed Waleed Fakhr

Keywords: Artificial intelligence, Cluster analysis, Document clustering, Canopy clustering algorithm, Correlation clustering, k-means clustering, CURE data clustering algorithm, Pattern recognition, Computer science, Data stream clustering, Document classification

Abstract: Automated document clustering is an important text mining task, especially with the rapid growth in the number of online documents in the Arabic language. Text clustering aims to automatically assign a text to a predefined cluster based on linguistic features. This research proposes an enhanced grid based clustering algorithm. The main purpose of this algorithm is to divide the data space into clusters of arbitrary shape. These clusters are considered as dense regions of points that are separated by regions of low density representing noise. The algorithm also deals with clustering a data set with multiple densities and assigning noise and outliers to the closest category, which will reduce the time complexity. Unclassified documents are preprocessed by removing stop words and extracting word roots, which are used to reduce the dimensionality of the feature vectors of the documents. Each document is then represented as a vector of words and their frequencies. The accuracy is presented according to time consumption and the percentage of successfully clustered instances. The results of experiments carried out on in-house collected documents have proven its effectiveness, with an average accuracy of 89%.
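The abstract gives no pseudocode, so the following is only a minimal Python sketch of the general idea it describes, not the authors' algorithm. It builds word-frequency vectors, groups points by joining adjacent dense grid cells, and then assigns noise points to the closest cluster. The function names and the parameters `cell_size` and `min_points` are assumptions introduced for illustration. The neighbour search enumerates 3^d adjacent cells, so the sketch is only practical for low-dimensional vectors.

```python
from collections import Counter, deque
from itertools import product


def term_frequencies(tokens, vocabulary):
    """Represent a tokenised document as a vector of word counts."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def grid_cluster(points, cell_size, min_points):
    """Label points by joining adjacent dense grid cells; noise gets -1."""
    # Map each point index to the integer coordinates of its grid cell.
    cells = {}
    for idx, point in enumerate(points):
        key = tuple(int(x // cell_size) for x in point)
        cells.setdefault(key, []).append(idx)

    # A cell is dense when it holds at least min_points points (assumed rule).
    dense = {key for key, members in cells.items() if len(members) >= min_points}
    dims = len(points[0]) if points else 0
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]

    # Breadth-first search merges neighbouring dense cells into one cluster.
    cell_label = {}
    next_label = 0
    for start in sorted(dense):
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for off in offsets:
                neighbour = tuple(c + o for c, o in zip(cell, off))
                if neighbour in dense and neighbour not in cell_label:
                    cell_label[neighbour] = next_label
                    queue.append(neighbour)
        next_label += 1

    labels = [-1] * len(points)
    for key, label in cell_label.items():
        for idx in cells[key]:
            labels[idx] = label
    return labels


def assign_noise(points, labels):
    """Give each noise point the label of its nearest clustered point."""
    clustered = [i for i, label in enumerate(labels) if label != -1]
    result = list(labels)
    if not clustered:
        return result
    for i, label in enumerate(labels):
        if label == -1:
            nearest = min(
                clustered,
                key=lambda j: sum((a - b) ** 2 for a, b in zip(points[i], points[j])),
            )
            result[i] = labels[nearest]
    return result
```

With two tight groups of two-dimensional points and one isolated point, `grid_cluster` marks the isolated point as noise (`-1`) and `assign_noise` then moves it into whichever cluster contains its nearest neighbour. Real document vectors would first need the dimensionality reduction that the abstract mentions (stop-word removal and root extraction) before a grid of this kind becomes workable.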


