The curse of indecomposable aggregates for big data exploratory analysis with a case for frequent pattern cubes

Authors: Hamid Fadishei, Azadeh Soltani

DOI: 10.1007/S11227-019-03053-8

Keywords: Computer science, Exploratory data analysis, Indecomposable module, Theoretical computer science, Big data, Exploratory analysis, Data cube, Cube, Curse, Hardware and Architecture, Software, Information Systems

Abstract: Exploratory big data analytics requires interaction delays to be kept to a minimum. Although data cubes help this goal by pre-calculating measures of interest, some aggregations are not decomposable and require runtime scans through the cube, which cause the response time to exceed real-time limits. One such costly aggregation is the calculation of frequent patterns over data cube partitions. The existing inefficient merge-and-count approach used for solving this problem is not feasible for real-world big data. In this paper, an efficient approach is proposed for mining frequent patterns from data cubes, accompanied by a formal overview of decomposable and indecomposable aggregates. A new concept of semi-decomposable aggregates is introduced that sits between these two extremes. With the case of the frequent pattern problem, we show that sometimes, in fact, exploratory analysis can still be realized with them. The proposed FPCubes algorithm shows promising experimental results for aggregating itemsets on real-world multidimensional datasets.
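The distinction the abstract draws can be illustrated with a minimal Python sketch. It is not the FPCubes algorithm. It uses a brute-force miner and two made-up cube cells (`cell_a`, `cell_b`, and `frequent_itemsets` are hypothetical names) only to show that a count can be merged from per-cell results, while per-cell frequent itemsets cannot simply be unioned and instead need a recount over the merged data.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force miner: count every non-empty subset of each transaction.

    Illustrative only; exponential in transaction length.
    """
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(transactions)
    return {s for s, c in counts.items() if c >= threshold}


# Two hypothetical cube cells (partitions), each holding a few transactions.
cell_a = [["x", "y"], ["x", "y"], ["x"], ["z"]]
cell_b = [["y", "z"], ["z"], ["z"], ["x", "y"]]

# Decomposable aggregate: the merged count follows from the per-cell counts.
count_merged = len(cell_a) + len(cell_b)
count_direct = len(cell_a + cell_b)

# Frequent itemsets: the union of per-cell answers differs from the answer
# obtained by rescanning the merged transactions (merge-and-count).
min_support = 0.5
per_cell_union = (
    frequent_itemsets(cell_a, min_support)
    | frequent_itemsets(cell_b, min_support)
)
merged_scan = frequent_itemsets(cell_a + cell_b, min_support)

print(count_merged == count_direct)
print(sorted(sorted(s) for s in per_cell_union - merged_scan))
```

In this toy data the itemset {x, y} is frequent inside `cell_a` but not over the merged cells, so the per-cell results alone do not determine the aggregate; the rescan over all transactions is the runtime cost that the abstract describes as prohibitive.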
