Characterizing fault tolerance in genetic programming

Authors: Daniel Lombraña González, Francisco Fernández de Vega, Henri Casanova

DOI: 10.1145/1555284.1555286

Keywords: Simple (abstract algebra), Evolutionary algorithm, Genetic programming, Recovery techniques, Parallel computing, Computation, Distributed computing, Fault tolerance, Computer science

Abstract: Evolutionary Algorithms (EAs), and particularly Genetic Programming (GP), are techniques frequently employed to solve difficult real-life problems, which can require up to days or months of computation. One approach to reduce the time to solution is to use parallel computing on distributed platforms. Distributed platforms are prone to failures, and when these are large and/or low-cost, failures are expected events rather than catastrophic exceptions. Therefore, fault tolerance and recovery techniques often become necessary. It turns out that Parallel GP (PGP) applications have an inherent ability to tolerate failures. This ability is quantified via simulation experiments performed using failure traces from real-world platforms, namely, desktop grids (DGs), for two well-known problems. A simple technique is then proposed by which PGP applications can better tolerate the different, and often high, failure rates seen in different
