Efficient algorithms for frequent pattern mining in many-task computing environments

Authors: Kawuu W. Lin, Yu-Chin Lo

DOI: 10.1016/J.KNOSYS.2013.04.004

Abstract: The goal of data mining is to discover hidden useful information in large databases. Mining frequent patterns from transaction databases is an important problem in data mining. As the database size increases, the computation time and required memory also increase, as the number of items grows and user behaviours become more complex. To address this increasing complexity, many researchers have applied parallel and distributed computing techniques to the discovery of frequent patterns from large amounts of data. However, most studies have focused on improving the performance of a single task and have neglected the many-task issue, which is important in current cloud-computing environments. In these environments, an application is often provided as a service, e.g., the Google search engine, implying that many users can use it simultaneously. In this paper, we propose a set of algorithms, comprising the Equal Working Set (EWS) algorithm, the Request On Demand (ROD) algorithm, the Small Size Working Set (SSWS) algorithm and the Progressive Size Working Set (PSWS) algorithm, for frequent pattern mining that provides a fast and scalable mining service in many-task computing environments. Through empirical evaluations under various simulation conditions, the proposed algorithms are shown to deliver excellent performance with respect to scalability and execution time.
