Research and Application of DBSCAN Algorithm Based on Hadoop Platform

Authors: Xiufen Fu, Yaguang Wang, Yanna Ge, Peiwen Chen, Shaohua Teng

Abstract: With the rapid development of the information age, more and more data can be obtained from the Internet, yet it is very difficult to extract useful knowledge from such huge amounts of data. Building on the existing DBSCAN algorithm, a new improved incremental DBSCAN clustering algorithm is proposed. It is combined with the open-source cloud computing framework Hadoop and uses the MapReduce programming model, which makes distributed applications easy to write, to simplify the program: the data elements are divided into chunks and distributed across the cluster to run as a job. In this way, data mining is integrated into Hadoop through the DBSCAN algorithm. When a manipulation (add or delete) has occurred in the database, only the changed data need to be mined and similar clusters merged to form the final mining result. Compared with mining the whole data set serially on a single-node server, the time delay of data processing is reduced. In the last part, the paper verifies the effectiveness of the approach through experiments and analysis.
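The chunk-then-merge idea described in the abstract can be illustrated with a small single-process sketch. This is not the authors' implementation and does not use Hadoop: the function names (`dbscan`, `map_chunk`, `merge_clusters`, `cluster_in_chunks`) are hypothetical, the "map" step simply runs plain DBSCAN on each chunk, and the "reduce" step assumes that two local clusters are "similar" when any pair of their points lies within `eps`. That merge rule is a simplification chosen for illustration, and the incremental add/delete handling is not covered.

```python
from collections import deque
from math import dist


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nbrs)
    return labels


def map_chunk(chunk, eps, min_pts):
    """'Map' step (assumed): cluster one chunk, return its clusters as point lists."""
    clusters = {}
    for point, label in zip(chunk, dbscan(chunk, eps, min_pts)):
        if label != -1:
            clusters.setdefault(label, []).append(point)
    return list(clusters.values())


def merge_clusters(local_clusters, eps):
    """'Reduce' step (assumed rule): union clusters with any cross pair within eps."""
    parent = list(range(len(local_clusters)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a in range(len(local_clusters)):
        for b in range(a + 1, len(local_clusters)):
            if any(dist(p, q) <= eps
                   for p in local_clusters[a] for q in local_clusters[b]):
                parent[find(a)] = find(b)

    merged = {}
    for idx, members in enumerate(local_clusters):
        merged.setdefault(find(idx), []).extend(members)
    return list(merged.values())


def cluster_in_chunks(chunks, eps, min_pts):
    """Cluster each chunk independently, then merge the local clusters."""
    local = [c for chunk in chunks for c in map_chunk(chunk, eps, min_pts)]
    return merge_clusters(local, eps)
```

In a real Hadoop job the per-chunk step would run in mappers on separate nodes and the merge in a reducer; here both run in one process so the data flow is easy to follow. Note that noise points near a chunk boundary are discarded by this sketch, which a production design would need to handle.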

