Petascale system management experiences

Authors: Cory Lueninghoener, William Scullin, Rick Bradshaw, Andrew Cherry, Susan Coghlan

Abstract: Petascale HPC systems are among the largest systems in the world. Intrepid, one such system, is a 40,000 node, 556 teraflop Blue Gene/P system that has been deployed at Argonne National Laboratory. In this paper, we provide some background about the system and our administration experiences. In particular, due to the scale of the system, we have faced a variety of issues, some surprising to us, that are not common in the commodity world. We discuss our expectations, these issues, and the approaches we have used to address them.
