Authors: M. V. Sukanya, Shiju Sathyadevan, U. B. Unmesha Sreeveni
DOI: 10.1007/978-3-319-11218-3_22
Keywords:
Abstract: Data management becomes a complex task when hundreds of petabytes of data are gathered, stored, and processed on a day-to-day basis. Efficient processing of this exponentially growing data is inevitable in this context. This paper discusses processing huge amounts of data with the Support Vector Machine (SVM) algorithm using techniques ranging from a single-node linear implementation to parallel distributed frameworks like Hadoop. The Map-Reduce component of Hadoop performs the parallelization process, which is used to feed information to Support Vector Machines (SVMs), supervised learning models applicable to classification and regression analysis. The paper also gives a detailed anatomy of SVM and sets a roadmap for implementing the same in both linear and parallel fashion. The main objective is to explain in detail the steps involved in developing an SVM from scratch and to conduct a performance analysis of the SVM implementations, single-node and on a Hadoop cluster, against the proven tool R, gauging them with respect to the accuracy achieved, their pace on varying data sizes, their capability to handle volume without breaking, etc.
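The abstract mentions developing an SVM from scratch before parallelizing it. As a rough illustration only (the paper's actual single-node and Hadoop implementations are not reproduced here), the following is a minimal linear SVM trained with hinge-loss sub-gradient descent; the function names and hyperparameters are this sketch's own assumptions, not the authors'.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Train a linear SVM by stochastic sub-gradient descent.

    X: list of feature vectors (lists of floats)
    y: labels in {-1, +1}
    lam: L2 regularization strength
    """
    rnd = random.Random(seed)
    d = len(X[0])
    w = [0.0] * d
    b = 0.0
    t = 0
    idx = list(range(len(X)))
    for _ in range(epochs):
        rnd.shuffle(idx)
        for i in idx:
            t += 1
            eta = 1.0 / (lam * t)  # decaying learning rate
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # L2 regularization shrinks the weights every step
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1:  # hinge loss active: push toward correct side
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def predict(w, b, x):
    """Classify x by the sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

In a Map-Reduce setting, a common approach is for each mapper to train such a model on its data split and for the reducer to combine the resulting support vectors or weights; the exact combination scheme used by the paper is not shown here.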