Dual incremental fuzzy schemes for frequent itemsets discovery in streaming numeric data

Authors: Hui Zheng, Peng Li, Qing Liu, Jinjun Chen, Guangli Huang

DOI: 10.1016/J.INS.2019.11.023

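Abstract: Discovering frequent itemsets is essential for finding association rules, yet it is too computationally expensive using existing algorithms. It is even more challenging to find frequent itemsets upon streaming numeric data. The streaming characteristic leads to a challenge that the data cannot be scanned repetitively. The numeric characteristic requires that the data should be pre-processed into itemsets, e.g., fuzzy-set methods can transform numeric data into itemsets with non-integer membership values. This leads to a challenge that the frequencies of itemsets are usually not integers. To overcome such challenges, fast methods and stream processing methods have been applied. However, the existing algorithms either still need to re-visit some previous data multiple times, or cannot count non-integer frequencies. Those algorithms that re-visit previous data have to sacrifice large memory spaces to cache those data and avoid repetitive scanning. When dealing with big streaming data nowadays, such a large-memory requirement often goes beyond the capacity of many computers. Those algorithms that are unable to count non-integer frequencies would be very inaccurate in estimating the non-integer frequencies of frequent itemsets if used with an integer approximation for frequency-counting. To solve the aforementioned issues, in this paper we propose two incremental schemes for frequent itemsets discovery that are capable of working efficiently with streaming numeric data. In particular, they are able to count non-integer frequencies without re-visiting any previous data. The key to the efficiency benefits of our schemes is to extract essential statistics that occupy much less memory than the raw data do for the ongoing streaming data. This grants our schemes the advantages of 1) allowing counting of non-integer frequencies and thus natural integration with a fuzzy-set discretization method to boost robustness and anti-noise capability for numeric data, 2) enabling the design of a decay ratio for different data distributions, which can be adapted to three general stream models: landmark, damped and sliding windows, 3) achieving highly-accurate fuzzy-item-sets discovery with efficient stream-processing. Experimental studies demonstrate the efficiency and effectiveness of our dual schemes with both synthetic and real-world datasets.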
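
The abstract does not give the two schemes in detail, so the following is only a minimal Python sketch of the general idea it describes: numeric values are turned into fuzzy items with non-integer membership degrees, and only running statistics (not raw records) are kept and updated as each record arrives. Everything named here is an assumption made for illustration, not the authors' method: the triangular membership functions, the `REGIONS` boundaries, the use of the minimum degree as an itemset's membership, and the `DecayedFuzzyCounter` class. A `decay` of 1.0 behaves like a landmark model and a `decay` below 1.0 like a damped model; a sliding window is not covered.

```python
from itertools import combinations


def triangular(x, left, peak, right):
    """Triangular membership degree of x, in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy regions for attribute values in roughly [0, 10].
REGIONS = {
    "low": (-5.0, 0.0, 5.0),
    "mid": (0.0, 5.0, 10.0),
    "high": (5.0, 10.0, 15.0),
}


def fuzzify(record):
    """Map {attribute: value} to {(attribute, region): degree}, keeping degree > 0."""
    items = {}
    for attr, value in record.items():
        for region, (left, peak, right) in REGIONS.items():
            degree = triangular(value, left, peak, right)
            if degree > 0.0:
                items[(attr, region)] = degree
    return items


class DecayedFuzzyCounter:
    """Keeps decayed, non-integer itemset counts; raw records are never stored."""

    def __init__(self, decay=1.0, max_size=2):
        self.decay = decay        # 1.0: landmark-like, < 1.0: damped-like
        self.max_size = max_size  # largest itemset size to track
        self.weight = 0.0         # decayed number of records seen so far
        self.support = {}         # itemset (tuple of items) -> decayed fuzzy count

    def update(self, record):
        # Age the existing statistics, then add the new record's contribution.
        self.weight = self.decay * self.weight + 1.0
        if self.decay != 1.0:
            for itemset in self.support:
                self.support[itemset] *= self.decay
        items = fuzzify(record)
        for size in range(1, self.max_size + 1):
            for itemset in combinations(sorted(items), size):
                # Skip itemsets that combine two regions of the same attribute.
                if len({attr for attr, _ in itemset}) < size:
                    continue
                # Assumption: itemset membership is the minimum item degree.
                degree = min(items[item] for item in itemset)
                self.support[itemset] = self.support.get(itemset, 0.0) + degree

    def frequent(self, min_support):
        """Return itemsets whose relative fuzzy support is at least min_support."""
        return {
            itemset: count / self.weight
            for itemset, count in self.support.items()
            if count / self.weight >= min_support
        }


counter = DecayedFuzzyCounter(decay=0.5)
for rec in [{"temp": 2.5}, {"temp": 0.0}]:
    counter.update(rec)
print(counter.frequent(0.5))
```

Because each record only touches the dictionary of counts and the running `weight`, memory grows with the number of tracked itemsets rather than with the length of the stream. This sketch enumerates all candidate itemsets up to `max_size` per record, which a practical scheme would need to prune.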
