Data processing for large database using feature selection

Authors: Nikat Parveen, Ananthi M

DOI: 10.1109/ICCCT2.2017.7972294

Abstract: Big data is a term for huge amounts of data sets that are becoming more complex, and data processing applications are inadequate to deal with them. Big data needs a set of techniques and technologies which can easily handle the data to be processed. Feature selection has become an apparent need in many applications to identify the required information from large data. Existing systems process the same data repeatedly each time a user request is submitted, even for a small task. In this work, a feature selection technique using Spark Streaming is proposed to get a live stream of data as input and process it in batches to extract the given features. The proposed method will use the dataset based on user requirements. The algorithm will also help increase the throughput of the system.

References (13)
K.S. Ray, T.K. Dinda, "Pattern classification using fuzzy relational calculus," Systems, Man, and Cybernetics, vol. 33, pp. 1-16 (2003), 10.1109/TSMCB.2002.804361
Yuefeng Li, Abdulmohsen Algarni, Mubarak Albathan, Yan Shen, Moch Arif Bijaksana, "Relevance Feature Discovery for Text Mining," IEEE Transactions on Knowledge and Data Engineering, vol. 27, pp. 1656-1669 (2015), 10.1109/TKDE.2014.2373357
Jie Xu, Dingxiong Deng, Ugur Demiryurek, Cyrus Shahabi, Mihaela van der Schaar, "Mining the Situation: Spatiotemporal Traffic Prediction With Big Data," IEEE Journal of Selected Topics in Signal Processing, vol. 9, pp. 702-715 (2015), 10.1109/JSTSP.2015.2389196
Zhou Zhao, Xiaofei He, Deng Cai, Lijun Zhang, Wilfred Ng, Yueting Zhuang, "Graph Regularized Feature Selection with Data Reconstruction," IEEE Transactions on Knowledge and Data Engineering, vol. 28, pp. 689-700 (2016), 10.1109/TKDE.2015.2493537
Simon Fong, Raymond Wong, Athanasios Vasilakos, "Accelerated PSO Swarm Search Feature Selection for Data Stream Mining Big Data," IEEE Transactions on Services Computing, vol. 9, pp. 33-45 (2016), 10.1109/TSC.2015.2439695
Cristina Soguero-Ruiz, Kristian Hindberg, Jose Luis Rojo-Alvarez, Stein Olav Skrovseth, Fred Godtliebsen, Kim Mortensen, Arthur Revhaug, Rolv-Ole Lindsetmo, Knut Magne Augestad, Robert Jenssen, "Support Vector Feature Selection for Early Detection of Anastomosis Leakage From Bag-of-Words in Electronic Health Records," IEEE Journal of Biomedical and Health Informatics, vol. 20, pp. 1404-1415 (2016), 10.1109/JBHI.2014.2361688
Jun Zhu, Eric Zhuang, Jian Fu, John Baranowski, Andrew Ford, James Shen, "A Framework-Based Approach to Utility Big Data Analytics," IEEE Transactions on Power Systems, vol. 31, pp. 2455-2462 (2016), 10.1109/TPWRS.2015.2462775
Liang Wang, Sotiris Tasoulis, Teemu Roos, Jussi Kangasharju, "Kvasir: Scalable Provision of Semantically Relevant Web Content on Big Data Framework," IEEE Transactions on Big Data, vol. 2, pp. 219-233 (2016), 10.1109/TBDATA.2016.2557348
Jun Chin Ang, Andri Mirzal, Habibollah Haron, Haza Nuzly Abdull Hamed, "Supervised, Unsupervised, and Semi-Supervised Feature Selection: A Review on Gene Selection," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 13, pp. 971-989 (2016), 10.1109/TCBB.2015.2478454