Checkpointing a hybrid architecture computing system

Authors: Richard Michael Shok, David L Darrington, Matthew W Markland, Philip James Sanders

Keywords: Computing systems, Host (network), Computation, Node (networking), Kernel (statistics), Architecture, Computer science, Parallel computing

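Abstract: A method, apparatus, and program product checkpoint an application in a parallel computing system of the type that includes a plurality of hybrid nodes. Each hybrid node includes a host element and a plurality of accelerator elements. Each host element may include at least one multithreaded processor, and each accelerator element may include at least one multi-element processor. In a first hybrid node from among the plurality of hybrid nodes, checkpointing the application includes executing at least a portion of the application in the host element, configuring at least one computation kernel in at least one accelerator element and, in response to receiving a command to checkpoint the application, checkpointing the host element separately from the at least one accelerator element upon which the at least one computation kernel is executing.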

References (16)
Shvetima Gulati, Ashwani Wason, Michael Oliver Neary, Fabrice Ferval, Method and system for providing transparent incremental and multiprocess checkpointing to computer applications (2005)
David Ernest Lackey, Thomas Andrew Stranko, Dennis Frank Ackerman, David Randa Bender, Gary Griffen Hallock, Salina Sau-Yue Chu, George Robert Deibert, Robert George Sheldon, A logic simulation using a hardware accelerator together with an automated error event isolation and trace facility (1991)
Brian Bailey, Jeffry A. Jones, Devon J. Kehoe, Synchronization of multiple simulation domains in an EDA simulation environment (2002)
Gilles Gervais, Rajat Chaudhry, Danny J. Klema, Sang H. Dhong, Method for performing power simulations on complex designs running complex software applications (2006)