Methodology and application of HPC I/O characterization with MPIProf and IOT

Authors: Yan-Tyng Sherry Chang, John Bauer, Henry Jin

DOI: 10.5555/3018823.3018824

Keywords: Supercomputer, Lustre (file system), Internet of Things, Speedup, Parallel computing, Development environment, Heavy load, Server, Computer science

Abstract: Combining the strengths of MPIProf and IOT, an efficient and systematic method is devised for I/O characterization at the per-job, per-rank, per-file, and per-call levels for programs running on the high-performance computing resources at the NASA Advanced Supercomputing (NAS) facility. The method is applied to four I/O questions in this paper. A total of 13 MPI programs and 15 cases, ranging from 24 to 5968 ranks, are analyzed to establish the I/O landscape from the answers to these questions. Four of the programs use MPI I/O, and the behavior of their collective writes depends on the specific implementation of the MPI library used. The SGI MPT library, the prevailing MPI library on NAS systems, was found to automatically gather small writes from a large number of ranks in order to perform larger writes by a subset of collective buffering ranks. The number of collective buffering ranks invoked depends on the Lustre stripe count and the number of nodes used in the run. A demonstration of varying the stripe count to achieve a double-digit speedup of one program's I/O is presented. Another program, which concurrently opens private files from all ranks and could potentially create a heavy load on the Lustre servers, is identified. The ability to systematically characterize the I/O of programs running on a supercomputer, seek optimization opportunities, and identify programs that cause high load and instability on the filesystems is important for pursuing exascale in a real production environment.

References (8)
Fengfeng Pan, Yinliang Yue, Jin Xiong, Daxiang Hao. I/O Characterization of Big Data Workloads in Data Centers. Big Data Benchmarks, Performance Optimization, and Emerging Hardware, pp. 85-97 (2014). DOI: 10.1007/978-3-319-13021-7_7
Philip C. Roth. Characterizing the I/O behavior of scientific applications on the Cray XT. Proceedings of the 2nd International Workshop on Petascale Data Storage, held in conjunction with Supercomputing '07 (PDSW '07), pp. 50-55 (2007). DOI: 10.1145/1374596.1374609
Philip Carns, Robert Latham, Robert Ross, Kamil Iskra, Samuel Lang, Katherine Riley. 24/7 Characterization of petascale I/O workloads. International Conference on Cluster Computing, pp. 1-10 (2009). DOI: 10.1109/CLUSTR.2009.5289150
Daniel Thomas, Jean-Pierre Panziera, John Baron. MPInside: a performance analysis and diagnostic tool for MPI applications. Workshop on Software and Performance, pp. 79-86 (2010). DOI: 10.1145/1712605.1712620
Subhash Saini, Jason Rappleye, Johnny Chang, David Barker, Piyush Mehrotra, Rupak Biswas. I/O performance characterization of Lustre and NASA applications on Pleiades. IEEE International Conference on High Performance Computing, Data, and Analytics, pp. 1-10 (2012). DOI: 10.1109/HIPC.2012.6507507
Sameer S. Shende, Allen D. Malony. The Tau Parallel Performance System. International Journal of High Performance Computing Applications, vol. 20, pp. 287-311 (2006). DOI: 10.1177/1094342006064482
Wenrui Dong, Guangming Liu, Jie Yu, You Zuo. Characterizing I/O workloads of HPC applications through online analysis. International Performance Computing and Communications Conference, pp. 1-2 (2015). DOI: 10.1109/PCCC.2015.7410353
Philip Schwan, Peter J. Braam. Lustre: The intergalactic file system (2002).