Frequent itemset mining on graphics processors

作者: Wenbin Fang , Mian Lu , Xiangye Xiao , Bingsheng He , Qiong Luo

DOI: 10.1145/1565694.1565702

关键词: Central processing unitTrieSpeedupParallel computingCUDASIMDGraphicsComputer scienceData structureBitmap

摘要: We present two efficient Apriori implementations of Frequent Itemset Mining (FIM) that utilize new-generation graphics processing units (GPUs). Our take advantage the GPU's massively multi-threaded SIMD (Single Instruction, Multiple Data) architecture. Both employ a bitmap data structure to exploit parallelism and accelerate frequency counting operation. One implementation runs entirely on GPU eliminates intermediate transfer between memory CPU memory. The other employs both for processing. It represents itemsets in trie, uses trie traversing incremental maintenance. preliminary results show achieve speedup up orders magnitude over optimized PC with an NVIDIA GTX 280 quad-core CPU.

参考文章(32)
Ferenc Bodon, A FAST APRIORI IMPLEMENTATION FIMI. pp. 0- 0 ,(2003)
Huan Liu, Ehtesham Haque, Lance Parsons, Evaluating Subspace Clustering Algorithms ,(2004)
Srinivasan Parthasarathy, Mitsunori Ogihara, Mohammed J Zaki, Wei Li, New algorithms for fast discovery of association rules knowledge discovery and data mining. pp. 283- 286 ,(1997)
Ramakrishnan Srikant, Rakesh Agrawal, Fast algorithms for mining association rules very large data bases. pp. 580- 592 ,(1998)
Ramakrishnan Srikant, Rakesh Agrawal, Fast Algorithms for Mining Association Rules in Large Databases very large data bases. pp. 487- 499 ,(1994)
Gokhan Memik, Jayaprakash Pisharath, Alok Choudhary, Wei-keng Liao, Pradeep Dubey, Ying Liu, NU-MineBench: Understanding the Performance and Scalability Characteristics of Data Mining Algorithms ,(2004)
M.J. Zaki, Parallel and distributed association mining: a survey IEEE Concurrency. ,vol. 7, pp. 14- 25 ,(1999) , 10.1109/4434.806975
Gregory Buehrer, Srinivasan Parthasarathy, Shirish Tatikonda, Tahsin Kurc, Joel Saltz, Toward terabyte pattern mining Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '07. pp. 2- 12 ,(2007) , 10.1145/1229428.1229432
S. Parthasarathy, M. J. Zaki, M. Ogihara, W. Li, Parallel data mining for association rules on shared memory systems Knowledge and Information Systems. ,vol. 3, pp. 1- 29 ,(2001) , 10.1007/PL00011656
Bingsheng He, Ke Yang, Rui Fang, Mian Lu, Naga Govindaraju, Qiong Luo, Pedro Sander, Relational joins on graphics processors Proceedings of the 2008 ACM SIGMOD international conference on Management of data - SIGMOD '08. pp. 511- 524 ,(2008) , 10.1145/1376616.1376670