Knowledge Discovery from Large Semi-structured Data Streams

Authors: Setsuo Arikawa, Shinji Kawasoe, Kenji Abe, Hiroki Arimura, Tatsuya Asai

Abstract: In this paper, we study an online data mining problem from a stream of semi-structured data such as XML data. Modeling semi-structured data and patterns as labeled ordered trees, we present an online algorithm, StreamT, that receives fragments of unseen, possibly infinite semi-structured data in document order through a data stream and can return the current set of frequent patterns immediately on request at any time. Moreover, we discuss the candidate management policy of StreamT. We present some policies and empirically examine the behavior of the algorithms with each policy. Experiments show that the forgetting model computes its results without being influenced by past events.
