MUSK: Uniform sampling of k maximal patterns

作者: Mohammad Al Hasan , Mohammed Javeed Zaki

DOI:

关键词:

摘要: Recent research in frequent pattern mining (FPM) has shifted from obtaining the complete set of patterns to generating only a representative (summary) subset patterns. Most existing approaches this problem adopt two-step solution; first step, they obtain all patterns, and second some form clustering is used summary set. However, twostep method inefficient sometimes infeasible since step itself may fail finish reasonable amount time. In paper, we propose an alternative approach representatives based on uniform sampling output space. Our new algorithm, Musk, obtains by uniformly pool maximal patterns; uniformity achieved variant Markov Chain Monte Carlo (MCMC) algorithm. Musk simulates random walk partial order graph with prescribed transition probability matrix, whose values are computed locally during simulation. stationary distribution walk, nodes sampled uniformly. Experiments various kind itemset databases validate effectiveness our approach.

参考文章(21)
Fan R K Chung, Spectral Graph Theory ,(1996)
Akihiro Inokuchi, Takashi Washio, Hiroshi Motoda, Complete Mining of Frequent Patterns from Graphs: Mining Graph Data Machine Learning. ,vol. 50, pp. 321- 354 ,(2003) , 10.1023/A:1021726221443
Siegfried Nijssen, Joost N. Kok, A quickstart in frequent structure mining can make a difference knowledge discovery and data mining. pp. 647- 652 ,(2004) , 10.1145/1014052.1014134
Guizhen Yang, The complexity of mining maximal frequent itemsets and maximal frequent patterns Proceedings of the 2004 ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '04. pp. 344- 353 ,(2004) , 10.1145/1014052.1014091
Karam Gouda, Mohammed J. Zaki, GenMax: An Efficient Algorithm for Mining Maximal Frequent Itemsets Data Mining and Knowledge Discovery. ,vol. 11, pp. 223- 242 ,(2005) , 10.1007/S10618-005-0002-X
Foto Afrati, Aristides Gionis, Heikki Mannila, Approximating a collection of frequent sets knowledge discovery and data mining. pp. 12- 19 ,(2004) , 10.1145/1014052.1014057
Vineet Chaoji, Mohammad Al Hasan, Saeed Salem, Mohammed J. Zaki, An integrated, generic approach to pattern mining: data mining template library Data Mining and Knowledge Discovery. ,vol. 17, pp. 457- 495 ,(2008) , 10.1007/S10618-008-0098-X
Xifeng Yan, Jiawei Han, CloseGraph: mining closed frequent graph patterns knowledge discovery and data mining. pp. 286- 295 ,(2003) , 10.1145/956750.956784