Authors: Richard Michael Shok, David L Darrington, Matthew W Markland, Philip James Sanders
DOI:
Keywords: Computing systems, Host (network), Computation, Node (networking), Kernel (statistics), Architecture, Computer science, Parallel computing
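Abstract: A method, apparatus, and program product checkpoint an application in a parallel computing system of the type that includes a plurality of hybrid nodes. Each hybrid node includes a host element and a plurality of accelerator elements. Each host element may include at least one multithreaded processor, and each accelerator element may include at least one multi-element processor. In a first hybrid node from among the plurality of hybrid nodes, checkpointing the application includes executing at least a portion of the application in the host element, configuring at least one computation kernel in at least one accelerator element and, in response to receiving a command to checkpoint the application, checkpointing the host element separately from the at least one accelerator element upon which the at least one computation kernel is executing.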
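
The idea of checkpointing a host element separately from the accelerator elements running computation kernels can be illustrated with a minimal sketch. This is not the patented implementation: the class names (`HostElement`, `AcceleratorElement`, `HybridNode`), the dictionary-based state, and the snapshot layout are all hypothetical, chosen only to show host state and accelerator state being saved as separate records.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AcceleratorElement:
    """Hypothetical accelerator element; kernel_name is None when idle."""
    kernel_name: Optional[str] = None
    kernel_state: dict = field(default_factory=dict)

    def checkpoint(self) -> dict:
        # Snapshot only this accelerator's kernel state.
        return {"kernel": self.kernel_name,
                "state": copy.deepcopy(self.kernel_state)}


@dataclass
class HostElement:
    """Hypothetical host element running the host portion of the application."""
    app_state: dict = field(default_factory=dict)

    def checkpoint(self) -> dict:
        return {"state": copy.deepcopy(self.app_state)}


@dataclass
class HybridNode:
    host: HostElement
    accelerators: List[AcceleratorElement]

    def checkpoint(self) -> dict:
        # The host is saved as its own record, separate from the records of
        # the accelerators on which a kernel is currently executing.
        host_record = self.host.checkpoint()
        accel_records = [a.checkpoint() for a in self.accelerators
                         if a.kernel_name is not None]
        return {"host": host_record, "accelerators": accel_records}


node = HybridNode(
    host=HostElement(app_state={"iteration": 3}),
    accelerators=[
        AcceleratorElement("stencil", {"partial_sum": 1.5}),
        AcceleratorElement(),  # idle accelerator, skipped by the checkpoint
    ],
)
snapshot = node.checkpoint()
```

Keeping the records separate means the host snapshot does not depend on accelerator memory, which is the property the abstract emphasizes; how a real system serializes kernel state is outside the scope of this sketch.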