Using Bloom Filters for Mining Top-k Frequent Itemsets in Data Streams

Authors: Younghee Kim, Kyungsoo Cho, Jaeyeol Yoon, Ieejoon Kim, Ungmo Kim

DOI: 10.1007/978-3-642-22339-6_25

Keywords:

Abstract: In this paper, we study the problem of finding the top-k most frequent itemsets in data streams. Mining only the top-k itemsets restricts the search to sub-domains of the workspace or to the result of some query. Most previous algorithms are clearly not suitable for data streams with limited memory, for instance when memory is allocated for each stream summary. Therefore, we propose an algorithm that addresses memory efficiency when mining top-k frequent itemsets from massive, high-speed data streams. Our algorithm, named MineTop-k, uses a Bloom filter structure, which permits efficient computation and maintenance of the results. We show that our approach is a memory-efficient method for the top-k problem.
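The abstract names a Bloom filter structure but gives no details of MineTop-k itself. As a rough illustration of the general idea only, the following Python sketch pairs a counting Bloom filter (a common Bloom filter variant that stores counters instead of bits) with a small candidate table to track the k most frequent items in a stream. It is not the authors' algorithm: the class `CountingBloomFilter`, the function `top_k_items`, and the parameters `size` and `num_hashes` are all hypothetical names chosen for this example, and the sketch counts single items rather than itemsets.

```python
import hashlib
import heapq


class CountingBloomFilter:
    """Counting Bloom filter: an array of counters addressed by several hashes."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _positions(self, key):
        # Derive num_hashes positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.counters[pos] += 1

    def estimate(self, key):
        # The minimum counter never undercounts; collisions can only inflate it.
        return min(self.counters[pos] for pos in self._positions(key))


def top_k_items(stream, k, size=1024, num_hashes=3):
    """Return up to k (estimated_count, item) pairs, highest counts first."""
    cbf = CountingBloomFilter(size, num_hashes)
    candidates = {}  # item -> latest estimate, trimmed to at most k entries
    for item in stream:
        cbf.add(item)
        candidates[item] = cbf.estimate(item)
        if len(candidates) > k:
            # Evict the candidate with the smallest estimate.
            weakest = min(candidates, key=candidates.get)
            del candidates[weakest]
    return heapq.nlargest(k, ((c, i) for i, c in candidates.items()))
```

Memory use is fixed by `size` and `k` rather than by the number of distinct items seen, which is the property the abstract emphasizes. An itemset could be handled under the same assumptions by encoding it as a canonical key, for example a sorted tuple of its items.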

References (13)
Moses Charikar, Kevin Chen, Martin Farach-Colton. Finding Frequent Items in Data Streams. International Colloquium on Automata, Languages and Programming, vol. 312, pp. 693-703 (2002). DOI: 10.1016/S0304-3975(03)00400-6
Andrea Pietracaprina, Fabio Vandin. Efficient Incremental Mining of Top-K Frequent Closed Itemsets. Discovery Science, pp. 275-280 (2007). DOI: 10.1007/978-3-540-75488-6_29
Martin Theobald, Gerhard Weikum, Ralf Schenkel. Top-k query evaluation with probabilistic guarantees. Very Large Data Bases, pp. 648-659 (2004). DOI: 10.1016/B978-012088469-8.50058-9
Tran Minh Quang, Shigeru Oyanagi, Katsuhiro Yamazaki. ExMiner: an efficient algorithm for mining top-k frequent patterns. Advanced Data Mining and Applications, pp. 436-447 (2006). DOI: 10.1007/11811305_48
Ahmed Metwally, Divyakant Agrawal, Amr El Abbadi. Efficient Computation of Frequent and Top-k Elements in Data Streams. Database Theory - ICDT 2005, pp. 398-412 (2004). DOI: 10.1007/978-3-540-30570-5_27
Xiaojian Zhang, Huili Peng. A Sliding-Window Approach for Finding Top-k Frequent Itemsets from Uncertain Streams. APWeb/WAIM '09 Proceedings of the Joint International Conferences on Advances in Data and Web Management, pp. 597-603 (2009). DOI: 10.1007/978-3-642-00672-2_57
Jiawei Han, Jianyong Wang, Ying Lu, P. Tzvetkov. Mining top-k frequent closed patterns without minimum support. International Conference on Data Mining, pp. 211-218 (2002). DOI: 10.1109/ICDM.2002.1183905
Yifeng Zhu, Hong Jiang. False Rate Analysis of Bloom Filter Replicas in Distributed Systems. International Conference on Parallel Processing, pp. 255-262 (2006). DOI: 10.1109/ICPP.2006.42
Jianyong Wang, J. Han, Y. Lu, P. Tzvetkov. TFP: an efficient algorithm for mining top-k frequent closed itemsets. IEEE Transactions on Knowledge and Data Engineering, vol. 17, pp. 652-664 (2005). DOI: 10.1109/TKDE.2005.81
Graham Cormode, Flip Korn, S. Muthukrishnan, Divesh Srivastava. Finding hierarchical heavy hitters in data streams. Very Large Data Bases, pp. 464-475 (2003). DOI: 10.1016/B978-012722442-8/50048-3