Mining interestingness measures for string pattern mining

Authors: Rafael Morales-Bueno, Manuel Baena-García

DOI: 10.5555/1945758.1945824

Abstract: In this paper we present a novel method to detect interesting patterns in strings. A common way to refine the results of pattern mining algorithms is to use interestingness measures, but the set of appropriate measures differs for each domain and problem. The aim of our research is to obtain a model that classifies patterns by interest. The method is based on the application of machine learning algorithms to a dataset generated from the features of factors. Each row of the dataset is associated with a factor of a string and contains the values of interestingness measures and contextual information. We also propose a new interestingness measure based on an entropy principle, which improves the obtained classification results. The proposed method avoids experts having to configure parameters in order to obtain interesting patterns. We demonstrate the utility of the method by giving an example on real data. The datasets and scripts needed to reproduce the experiments are available online.
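
The dataset layout described in the abstract, one row per factor with measure values attached, might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `factors` and `factor_rows`, the `max_len` cutoff, and the two measures shown (relative support and a lift-style ratio against an independent-symbol baseline) are all hypothetical choices. The entropy-based measure proposed in the paper is not reproduced here because the abstract does not define it.

```python
from collections import Counter


def factors(s, max_len):
    """Yield every factor (contiguous substring) of s with length 1..max_len."""
    for i in range(len(s)):
        for j in range(i + 1, min(i + max_len, len(s)) + 1):
            yield s[i:j]


def factor_rows(s, max_len=3):
    """Build one row per distinct factor of s (assumes s is non-empty).

    The measures are illustrative stand-ins, not those used in the paper.
    """
    counts = Counter(factors(s, max_len))  # overlapping occurrences are counted
    symbol_counts = Counter(s)
    n = len(s)
    rows = []
    for factor, count in counts.items():
        positions = n - len(factor) + 1  # number of possible start positions
        support = count / positions
        # Expected relative frequency if symbols were drawn independently.
        expected = 1.0
        for ch in factor:
            expected *= symbol_counts[ch] / n
        rows.append({
            "factor": factor,
            "length": len(factor),
            "count": count,
            "support": support,
            "lift": support / expected,
        })
    return rows


if __name__ == "__main__":
    for row in factor_rows("abab", max_len=2):
        print(row)
```

Rows of this shape could then be labeled by interest and passed to any standard classifier. Which measures and contextual features to include is exactly the kind of choice the paper's method is meant to learn rather than fix by hand.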
