Mining Condensed Sets of Frequent Episodes with More Accurate Frequencies from Complex Sequences

Authors: Honghua Dai, Min Gan

DOI:

Keywords: Frequent episodes, Condensed sets, Sequence data mining

Abstract: Many previous approaches to frequent episode discovery only accept simple sequences. Although a recent approach has been able to find episodes from complex sequences, the discovered sets are neither condensed nor accurate. This paper investigates the discovery of condensed sets of frequent episodes from complex sequences. We adopt a novel anti-monotonic frequency measure based on non-redundant occurrences, and define a condensed set, nDaCF (the set of non-derivable approximately closed frequent episodes), within a given maximal error bound of support. We then introduce a series of effective pruning strategies and develop a method, nDaCF-Miner, for discovering the condensed sets. Experimental results show that, when the error bound is somewhat high, the discovered nDaCF sets are two orders of magnitude smaller than the complete sets, and nDaCF-Miner is more efficient than previous mining approaches. In addition, more accurate frequencies are found by nDaCF-Miner.

References (16)
Heikki Mannila, A. Inkeri Verkamo, Hannu Toivonen, Discovering Frequent Episodes in Sequences. Knowledge Discovery and Data Mining, pp. 210-215 (1995)
Heikki Mannila, Hannu Toivonen, Discovering Generalized Episodes Using Minimal Occurrences. Knowledge Discovery and Data Mining, pp. 146-151 (1996)
Peilin Jiang, Fuji Ren, Nanning Zheng, A New Approach to Data Clustering Using a Computational Visual Attention Model. International Journal of Innovative Computing, Information and Control, vol. 5, pp. 4597-4605 (2009)
Honghua Dai, Min Gan, Obtaining Accurate Frequencies of Sequential Patterns over a Single Sequence. ICIC Express Letters, vol. 5, pp. 1461-1466 (2011)
Guozhu Dong, Jian Pei, Sequence Data Mining (2007)
Nicolas Pasquier, Yves Bastide, Rafik Taouil, Lotfi Lakhal, Discovering Frequent Closed Itemsets for Association Rules. International Conference on Database Theory, vol. 1540, pp. 398-416 (1999), 10.1007/3-540-49257-7_25
Kuo-Yu Huang, Chia-Hui Chang, Efficient Mining of Frequent Episodes from Complex Sequences. Information Systems, vol. 33, pp. 96-114 (2008), 10.1016/J.IS.2007.07.003
K. Iwanuma, R. Ishihara, Yo Takano, H. Nabeshima, Extracting Frequent Subsequences from a Single Long Data Sequence: A Novel Anti-Monotonic Measure and a Simple On-Line Algorithm. International Conference on Data Mining, pp. 186-193 (2005), 10.1109/ICDM.2005.60
Bolin Ding, David Lo, Jiawei Han, Siau-Cheng Khoo, Efficient Mining of Closed Repetitive Gapped Subsequences from a Sequence Database. International Conference on Data Engineering, pp. 1024-1035 (2009), 10.1109/ICDE.2009.104
Jian Pei, Guozhu Dong, Wei Zou, Jiawei Han, On Computing Condensed Frequent Pattern Bases. International Conference on Data Mining, pp. 378-385 (2002), 10.1109/ICDM.2002.1183928