Using Subfiling to Improve Programming Flexibility and Performance of Parallel Shared-file I/O

Authors: Kui Gao, Wei-keng Liao, Arifa Nisar, Alok Choudhary, Robert Ross

DOI: 10.1109/ICPP.2009.68

Abstract: There are two popular parallel I/O programming styles used by modern scientific computational applications: unique-file and shared-file. Unique-file I/O usually gives satisfactory performance, but its major drawback is that managing a large number of files can overwhelm the task of post-simulation data processing. Shared-file I/O produces fewer files and allows arrays partitioned among processes to be saved in canonical order. As the number of processors on parallel machines increases into the thousands and more, the problem size, and in turn the global array size, also increases proportionally. It is not practical to manage each global array larger than a few hundred GB. Hence, to seek a middle ground between these two styles, we propose a subfiling scheme that divides a large multi-dimensional global array into smaller subarrays, each saved in a smaller file, named a subfile. Subfiling is implemented on top of MPI-IO. We incorporate it into the netCDF library in order to preserve the partitioning information in the file header, so the global array can later be reconstructed. In addition, since subfiling decreases the number of processes sharing a file, it can reduce the overhead of the file system's data consistency control. Our experimental results with several I/O benchmarks show that subfiling can provide improved performance.
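
The partitioning idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the subfile naming pattern, the choice to split along a single dimension, and the contiguous rank-to-subfile grouping are all assumptions made for illustration. The sketch only computes which region of the global array each subfile would hold and which subfile a given process rank would write to; the actual MPI-IO and netCDF calls are omitted.

```python
from typing import Dict, List, Tuple


def split_extent(n: int, parts: int) -> List[Tuple[int, int]]:
    """Split range(n) into `parts` contiguous (start, count) blocks, as evenly as possible."""
    base, extra = divmod(n, parts)
    blocks = []
    start = 0
    for i in range(parts):
        # The first `extra` blocks take one additional element.
        count = base + (1 if i < extra else 0)
        blocks.append((start, count))
        start += count
    return blocks


def subfile_layout(
    global_shape: Tuple[int, ...],
    num_subfiles: int,
    split_dim: int = 0,
    prefix: str = "output",  # hypothetical file-name prefix
) -> List[Dict]:
    """Describe the subarray stored in each subfile.

    Assumption: the global array is cut along one dimension only.
    Each entry records the subfile name plus the start offsets and
    counts of its subarray in global coordinates, which is the kind
    of partitioning information needed to rebuild the global array.
    """
    if not 0 <= split_dim < len(global_shape):
        raise ValueError("split_dim out of range")
    if not 1 <= num_subfiles <= global_shape[split_dim]:
        raise ValueError("num_subfiles must be between 1 and the split extent")

    layout = []
    blocks = split_extent(global_shape[split_dim], num_subfiles)
    for idx, (start, count) in enumerate(blocks):
        starts = [0] * len(global_shape)
        counts = list(global_shape)
        starts[split_dim] = start
        counts[split_dim] = count
        layout.append(
            {
                "name": f"{prefix}.subfile_{idx}.nc",  # hypothetical naming pattern
                "start": tuple(starts),
                "count": tuple(counts),
            }
        )
    return layout


def subfile_of_rank(rank: int, num_procs: int, num_subfiles: int) -> int:
    """Map a process rank to a subfile index by grouping contiguous ranks.

    Assumption: ranks are ordered so that contiguous ranks own
    neighbouring pieces of the array along the split dimension.
    """
    if not 0 <= rank < num_procs:
        raise ValueError("rank out of range")
    return rank * num_subfiles // num_procs


if __name__ == "__main__":
    for entry in subfile_layout((8, 4, 4), num_subfiles=4):
        print(entry)
    print([subfile_of_rank(r, 16, 4) for r in range(16)])
```

With 16 processes and 4 subfiles, each subfile in this sketch is shared by only 4 processes instead of all 16, which is the reduction in file sharing that the abstract credits for lower consistency-control overhead.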
