An Unsupervised Algorithm for Segmenting Categorical Timeseries into Episodes

Authors: Paul Cohen, Brent Heeringa, Niall M. Adams


Abstract: This paper describes an unsupervised algorithm for segmenting categorical time series into episodes. The VOTING-EXPERTS algorithm first collects statistics about the frequency and boundary entropy of ngrams, then passes a window over the series and has two "expert methods" decide where in the window boundaries should be drawn. The algorithm successfully segments text into words in four languages. It also segments time series of robot sensor data into subsequences that represent episodes in the life of the robot. We claim that VOTING-EXPERTS finds meaningful episodes in categorical time series because it exploits two statistical characteristics of meaningful episodes.
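
The procedure summarized in the abstract (ngram statistics, a sliding window, two voting experts) can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the per-length standardization of scores, the restriction to split points inside the window, and the "local maximum above a threshold" cutting rule are all choices made here for illustration; the abstract does not specify them.

```python
import math
from collections import Counter, defaultdict


def ngram_stats(seq, max_len):
    """Count every ngram up to max_len and the symbols that follow each one."""
    freq = Counter()
    followers = defaultdict(Counter)
    for i in range(len(seq)):
        for n in range(1, max_len + 1):
            if i + n > len(seq):
                break
            gram = seq[i:i + n]
            freq[gram] += 1
            if i + n < len(seq):
                followers[gram][seq[i + n]] += 1
    return freq, followers


def boundary_entropy(follow_counts):
    """Entropy (in bits) of the distribution of symbols following an ngram."""
    total = sum(follow_counts.values())
    return -sum((c / total) * math.log2(c / total) for c in follow_counts.values())


def standardize(values):
    """Z-score each value against other ngrams of the same length (assumption)."""
    by_len = defaultdict(list)
    for gram, v in values.items():
        by_len[len(gram)].append(v)
    stats = {}
    for n, vals in by_len.items():
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        stats[n] = (mean, math.sqrt(var) or 1.0)  # avoid dividing by zero
    return {g: (v - stats[len(g)][0]) / stats[len(g)][1] for g, v in values.items()}


def voting_experts(seq, window=3, threshold=1):
    """Segment a string (or tuple) of categorical symbols into episodes."""
    freq, followers = ngram_stats(seq, window)
    z_freq = standardize(freq)
    z_ent = standardize({g: boundary_entropy(followers[g]) for g in freq})

    votes = [0] * (len(seq) + 1)  # votes[k]: votes for a boundary before seq[k]
    for start in range(len(seq) - window + 1):
        win = seq[start:start + window]
        splits = range(1, window)  # interior split points only (simplification)
        # Frequency expert: prefer the split whose two parts are both frequent.
        best_f = max(splits, key=lambda k: z_freq[win[:k]] + z_freq[win[k:]])
        # Entropy expert: prefer the split after the least predictable prefix.
        best_e = max(splits, key=lambda k: z_ent[win[:k]])
        votes[start + best_f] += 1
        votes[start + best_e] += 1

    # Cut where the vote count exceeds the threshold and is a local maximum.
    cuts = [k for k in range(1, len(seq))
            if votes[k] > threshold
            and votes[k] > votes[k - 1] and votes[k] >= votes[k + 1]]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

Calling `voting_experts("thecatsatonthematthecatsatonthemat")` returns a list of substrings whose concatenation is the input. How closely the pieces match real words depends on the amount of data and on the hypothetical `window` and `threshold` parameters.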

