Data summarization: a survey

Author: Mohiuddin Ahmed

DOI: 10.1007/S10115-018-1183-0

Keywords: Unstructured data, Key (cryptography), Automatic summarization, Information retrieval, Anomaly detection, Knowledge extraction, Computer science

Abstract: Summarization has been proven to be a useful and effective technique supporting data analysis of large amounts of data. Knowledge discovery from data (KDD) is time consuming, and summarization is an important step to expedite KDD tasks by intelligently reducing the size of the processed data. In this paper, different summarization techniques for structured and unstructured data are discussed. The key finding of this survey is that not all summarization techniques create a summary suitable for further data analysis. It is highlighted that sampling techniques are a viable way of creating a summary for further knowledge discovery, such as anomaly detection from the summary. Evaluation metrics for summaries are also discussed.
