Towards a Scalable Distributed Fitness Evaluation Service

Authors: Włodzimierz Funika, Paweł Koperek

DOI: 10.1007/978-3-319-32149-3_46

Keywords: Symbolic regression, Bottleneck, Evolutionary programming, Distributed computing, Service (systems architecture), Resource (project management), Speedup, Implementation, Spark (mathematics), Computer science

Abstract: Organizations across the globe gather more and more data. Large datasets require new approaches to analysis and processing, which include methods based on machine learning. In particular, symbolic regression can provide many useful insights. Unfortunately, due to high resource requirements, use of this method for large datasets might be unfeasible. In this paper we analyze a bottleneck in an open-source implementation of this method, which we call hubert. We identify that the evaluation of individuals is the most costly operation. As a solution to this problem, we propose a service based on the Apache Spark framework, which attempts to speed up computations by distributing them on a cluster of machines. We compare the performance by analyzing the execution time for a number of samples with both implementations. Then we discuss how the performance of computation improves with an increased amount of resources. Finally, we draw conclusions and outline plans for further research.
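To illustrate the idea of distributing fitness evaluation, the following is a minimal sketch and not the hubert or Spark-based implementation described in the paper. It assumes that an individual is a plain Python callable, that fitness is the mean squared error over the samples, and that a local thread pool stands in for the cluster. All names (partition, partial_error, evaluate_population) are hypothetical. The map step computes a partial error per data chunk and the reduce step combines the partial results, which mirrors how such a computation could be split across workers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]            # (input x, observed y)
Individual = Callable[[float], float]   # candidate expression (assumed form)


def partition(samples: Sequence[Sample], n_parts: int) -> List[Sequence[Sample]]:
    """Split the samples into at most n_parts contiguous chunks."""
    size = max(1, -(-len(samples) // n_parts))  # ceiling division
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def partial_error(individual: Individual, chunk: Sequence[Sample]) -> Tuple[float, int]:
    """Map step: sum of squared errors and sample count for one chunk."""
    sse = sum((individual(x) - y) ** 2 for x, y in chunk)
    return sse, len(chunk)


def fitness(individual: Individual, chunks: List[Sequence[Sample]],
            pool: ThreadPoolExecutor) -> float:
    """Reduce step: combine per-chunk results into a mean squared error."""
    partials = list(pool.map(lambda c: partial_error(individual, c), chunks))
    total_sse = sum(p[0] for p in partials)
    total_n = sum(p[1] for p in partials)
    return total_sse / total_n


def evaluate_population(population: Sequence[Individual],
                        samples: Sequence[Sample],
                        n_parts: int = 4) -> List[float]:
    """Evaluate every individual against all samples, chunk by chunk."""
    if not samples:
        raise ValueError("at least one sample is required")
    chunks = partition(samples, n_parts)
    # The thread pool is only a local stand-in for a cluster of workers.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        return [fitness(ind, chunks, pool) for ind in population]


if __name__ == "__main__":
    data = [(float(x), 2.0 * x + 1.0) for x in range(10)]
    candidates = [lambda x: 2.0 * x + 1.0, lambda x: 2.0 * x]
    print(evaluate_population(candidates, data))
```

Because each chunk contributes only a pair (sum of squared errors, count), the partial results can be combined in any order, which is the property that makes this kind of evaluation straightforward to distribute.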

References (12)
J. Evans, A. Rzhetsky, Philosophy of science. Machine science. Science, vol. 329, pp. 399-400 (2010), 10.1126/SCIENCE.1189416
Wlodzimierz Funika, Pawel Koperek, Genetic Programming in Automatic Discovery of Relationships in Computer System Monitoring Data. International Conference on Parallel Processing, pp. 371-380 (2013), 10.1007/978-3-642-55224-3_35
Włodzimierz Funika, Paweł Koperek, Mateusz Kupisz, Towards Autonomic Semantic-Based Management of Distributed Applications. Computer Science, vol. 11, pp. 51-51 (2010), 10.7494/CSCI.2010.11.0.51
Michael Schmidt, Hod Lipson, Distilling Free-Form Natural Laws from Experimental Data. Science, vol. 324, pp. 81-85 (2009), 10.1126/SCIENCE.1165893
Michael D. Schmidt, Hod Lipson, Data-Mining Dynamical Systems: Automated Symbolic System Identification for Exploratory Analysis. Volume 2: Automotive Systems; Bioengineering and Biomedical Technology; Computational Mechanics; Controls; Dynamical Systems, pp. 643-649 (2008), 10.1115/ESDA2008-59309
A. Salhi, H. Glaser, D. De Roure, Parallel implementation of a genetic-programming based tool for symbolic regression. Information Processing Letters, vol. 66, pp. 299-307 (1998), 10.1016/S0020-0190(98)00056-8
Michael D. Schmidt, Hod Lipson, Age-fitness Pareto optimization. Genetic and Evolutionary Computation Conference, pp. 543-544 (2010), 10.1145/1830483.1830584
Xin Du, Youcong Ni, Zhiqiang Yao, Ruliang Xiao, Datong Xie, High performance parallel evolutionary algorithm model based on MapReduce framework. Journal of Computer Applications in Technology, vol. 46, pp. 290-295 (2013), 10.1504/IJCAT.2013.052807
Ross D. King, Jem Rowland, Stephen G. Oliver, Michael Young, Wayne Aubrey, Emma Byrne, Maria Liakata, Magdalena Markham, Pinar Pir, Larisa N. Soldatova, Andrew Sparkes, Kenneth E. Whelan, Amanda Clare, The Automation of Science. Science, vol. 324, pp. 85-89 (2009), 10.1126/SCIENCE.1165620