A fast algorithm for frequent itemset mining using Patricia* structures

作者: Jun-Feng Qu , Mengchi Liu

DOI: 10.1007/978-3-642-32584-7_17

关键词:

摘要: Efficient mining of frequent itemsets from a database plays an essential role in many data tasks such as association rule mining. Many algorithms use prefix-tree to represent and mine by constructing recursively conditional prefix-trees the prefix-tree. A (conditional) can be stored various structures. The construction traversal costs prefix-trees, or rather their storage structures, take large proportion whole cost for algorithms. PatriciaMine algorithm employs Patricia trie store shows good performance. In this study, we introduce efficient Patricia* structure storing is more compact contiguous than corresponding trie, thus former are less those latter. Previous prefix-tree-based adopt similar procedure, which most nodes repeatedly accessed when processed. paper presents novel procedure node accesses greatly reduced. We propose PatriciaMine* that combination with proposed procedure. Experimental show outperforms not only but also several fast algorithms, FPgrowth* dEclat, databases.

参考文章(15)
Guimei Liu, Jeffrey Xu Yu, Xiangye Xiao, Hongjun Lu, Wei Wang, AFOPT: An Efficient Implementation of Pattern Growth Approach. FIMI. ,(2003)
Lars Schmidt-Thieme, Algorithmic Features of Eclat. FIMI. ,(2004)
Andrea Pietracaprina, Dario Zandolin, Mining Frequent Itemsets using Patricia Tries. FIMI. ,(2003)
Yuh-Jiuan Tsay, Tain-Jung Hsu, Jing-Rung Yu, FIUT: A new method for mining frequent itemsets Information Sciences. ,vol. 179, pp. 1724- 1737 ,(2009) , 10.1016/J.INS.2009.01.010
Guimei Liu, Jeffrey Xu Yu, Hongjun Lu, Yabo Xu, Wenwu Lou, Efficient Mining of Frequent Patterns Using Ascending Frequency Ordered Prefix-Tree Data Mining and Knowledge Discovery. ,vol. 9, pp. 249- 274 ,(2004) , 10.1023/B:DAMI.0000040905.52966.1A
Mohammed J. Zaki, Karam Gouda, Fast vertical mining using diffsets knowledge discovery and data mining. pp. 326- 335 ,(2003) , 10.1145/956750.956788
Hoang Thanh Lam, Toon Calders, Mining top-k frequent items in a data stream with flexible sliding windows knowledge discovery and data mining. pp. 283- 292 ,(2010) , 10.1145/1835804.1835842
Tobias Bjerregaard, Shankar Mahadevan, A survey of research and practices of Network-on-chip ACM Computing Surveys. ,vol. 38, pp. 1- 51 ,(2006) , 10.1145/1132952.1132953
Jiawei Han, Jian Pei, Yiwen Yin, Runying Mao, Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach Data Mining and Knowledge Discovery. ,vol. 8, pp. 53- 87 ,(2004) , 10.1023/B:DAMI.0000005258.31418.83
Toon Calders, Calin Garboni, Bart Goethals, Approximation of Frequentness Probability of Itemsets in Uncertain Data international conference on data mining. pp. 749- 754 ,(2010) , 10.1109/ICDM.2010.42