Towards a Scalable Distributed Fitness Evaluation Service

Keywords: Symbolic regression, Bottleneck, Evolutionary programming, Distributed computing, Service (systems architecture), Resource (project management), Speedup, Implementation, Spark (mathematics), Computer science

Abstract: Organizations across the globe gather more and more data. Large datasets require new approaches to analysis and processing, which include methods based on machine learning. In particular, symbolic regression can provide many useful insights. Unfortunately, due to its high resource requirements, using this method on large datasets might be infeasible. In this paper we analyze a bottleneck in an open-source implementation of this method, which we call hubert. We identify the evaluation of individuals as the most costly operation. As a solution to this problem, we propose a service based on the Apache Spark framework, which attempts to speed up computations by distributing them across a cluster of machines. We compare performance by analyzing the execution time for a number of samples with both implementations. Then we discuss how the computation time improves with an increased amount of resources. Finally, we draw conclusions and outline plans for further research.
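The idea of distributing the evaluation of individuals can be illustrated with a minimal sketch. It is not the Spark-based service from the paper: it uses Python's standard-library thread pool as a local stand-in for a cluster, so it shows the structure of the computation rather than a real speedup. The tuple encoding of expressions, the names `evaluate`, `partial_error`, `split`, and `fitness`, and the choice of mean squared error as the fitness measure are all assumptions made for illustration. The dataset is split into partitions, each worker computes a partial error for one individual on one partition, and the partial results are summed.

```python
from concurrent.futures import ThreadPoolExecutor
import math
import operator

# Hypothetical expression encoding (an assumption, not hubert's format):
# ("x",), ("const", value), or (op, left, right) with op in OPS.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate(expr, x):
    """Evaluate one expression tree at a single input value x."""
    tag = expr[0]
    if tag == "x":
        return x
    if tag == "const":
        return expr[1]
    return OPS[tag](evaluate(expr[1], x), evaluate(expr[2], x))


def partial_error(expr, chunk):
    """Sum of squared errors of one individual on one data partition."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in chunk)


def split(data, parts):
    """Split data into at most `parts` contiguous partitions."""
    size = max(1, math.ceil(len(data) / parts))
    return [data[i:i + size] for i in range(0, len(data), size)]


def fitness(population, data, workers=4):
    """Mean squared error of every individual; data must be non-empty."""
    chunks = split(data, workers)
    scores = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for expr in population:
            # Map step: one task per partition; reduce step: sum of partials.
            partials = pool.map(lambda c, e=expr: partial_error(e, c), chunks)
            scores.append(sum(partials) / len(data))
    return scores


# Samples drawn from y = 2x + 1; the first individual matches it exactly.
data = [(x, 2 * x + 1) for x in range(100)]
population = [
    ("add", ("mul", ("const", 2), ("x",)), ("const", 1)),
    ("x",),
]
scores = fitness(population, data)
print(scores)
```

In a cluster setting, the partitions would live on different machines and only the expression and the partial sums would travel over the network, which is why the per-partition sum is kept separate from the final division.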
