Implementation Issues in the Design of I/O Intensive Data Mining Applications on Clusters of Workstations

Authors: R. Baraglia, D. Laforenza, Salvatore Orlando, P. Palmerini, Raffaele Perego

DOI: 10.1007/3-540-45591-4_46

Keywords: Knowledge acquisition, Input/output, Information system, Data mining, Computer science, Parallel algorithm, Implementation, Cluster analysis, Workstation, Scalability

Abstract: This paper investigates scalable implementations of out-of-core, I/O-intensive Data Mining (DM) algorithms on affordable parallel architectures, such as clusters of workstations. In order to validate our approach, the K-means algorithm, a well-known DM clustering algorithm, was used as a test case.

References (15)
Michael Beck, Harald Bohme, Mirko Dziadzka, Ulrich Kunitz, Robert Magnus, Dirk Verworner, Linux Kernel Internals with CD-ROM. Addison-Wesley Longman Publishing Co., Inc. (1997)
Rajkumar Buyya, High Performance Cluster Computing (1999)
Kilian Stoffel, Abdelkader Belkoniene, Parallel k/h-Means Clustering for Large Data Sets. European Conference on Parallel Processing, pp. 1451-1454 (1999), 10.1007/3-540-48311-X_205
John Salmon, Daniel F. Savarese, Thomas L. Sterling, Donald J. Becker, How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters (1999)
S. H. Lavington, Alex A. Freitas, Mining Very Large Databases with Parallel Processing (1997)
Inderjit S. Dhillon, Dharmendra S. Modha, A Data-Clustering Algorithm on Distributed Memory Multiprocessors. Knowledge Discovery and Data Mining, pp. 245-260 (1999), 10.1007/3-540-46502-2_13
Richard C. Dubes, Anil K. Jain, Algorithms for Clustering Data (1988)
Eui-Hong Han, G. Karypis, V. Kumar, Scalable parallel data mining for association rules. IEEE Transactions on Knowledge and Data Engineering, vol. 12, pp. 337-352 (2000), 10.1109/69.846289
M. Srinivas, L.M. Patnaik, Genetic algorithms: a survey. IEEE Computer, vol. 27, pp. 17-26 (1994), 10.1109/2.294849
Jeffrey Scott Vitter, External memory algorithms and data structures. External Memory Algorithms, pp. 1-38 (1999)