GPUMAFIA: Efficient Subspace Clustering with MAFIA on GPUs

Authors: Andrew Adinetz, Jiri Kraus, Jan Meinke, Dirk Pleiter

Abstract: Clustering, i.e., the identification of regions of similar objects in a multi-dimensional data set, is a standard method of data analytics with a large variety of applications. For high-dimensional data, subspace clustering can be used to find clusters among a certain subset of data point dimensions and alleviate the curse of dimensionality. In this paper we focus on the MAFIA subspace clustering algorithm and on using GPUs to accelerate the algorithm. We first present a number of algorithmic changes and estimate their effect on computational complexity. These changes improve the sequential version by 1-2 orders of magnitude on practical datasets while providing exactly the same output. We then present the GPU version of the algorithm, which for typical datasets provides a further speedup over a single CPU core, or about an order of magnitude over a multi-core CPU. We believe that our faster implementation widens the applicability of MAFIA and subspace clustering.
