Comprehensive job level resource usage measurement and analysis for XSEDE HPC systems

Authors: Charng-Da Lu, James Browne, Robert L. DeLeon, John Hammond, William Barth

DOI: 10.1145/2484762.2484781

Abstract: This paper presents a methodology for comprehensive job level resource use measurement and analysis, applications of the analyses to planning for HPC systems, and a case study application of the methodology to the XSEDE Ranger and Lonestar4 systems at the University of Texas. The steps in the methodology are: system-wide collection of resource use and performance statistics at the job and node levels, then mapping and storage of the resultant job-wise data to a relational database, which eases further implementation and transformation of the data to the formats required by specific statistical and analytical algorithms. Analyses can be carried out at different levels of granularity: on a job, user, or system-wide basis. Measurements are based on a novel lightweight job-centric measurement tool, "TACC_Stats" [1], which gathers a set of resource use metrics on all compute nodes. The tools will be an extension to the XDMoD project [2] for the XSEDE community. This paper also reports preliminary results from the analysis of measured data for the Texas Advanced Computing Center's supercomputers. The case studies presented indicate the level of detailed information that will be available for all resources when TACC_Stats is deployed throughout the XSEDE system. The methodology can be applied to any system that runs the TACC_Stats tool.
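The pipeline outlined in the abstract (node-level collection, mapping to job-wise rows in a relational database, then analysis at job, user, or system-wide granularity) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the table names `node_stats` and `job_stats`, the columns, the helper functions, and the sample rows are hypothetical and do not reflect the actual TACC_Stats output format or the XDMoD schema. SQLite stands in for whatever relational database the authors used.

```python
import sqlite3

# Hypothetical per-node samples: (job_id, user, node, cpu_seconds, mem_gb).
# These rows are invented for illustration; they are not TACC_Stats output.
NODE_SAMPLES = [
    ("job1", "alice", "n001", 100.0, 4.0),
    ("job1", "alice", "n002", 120.0, 6.0),
    ("job2", "alice", "n003", 50.0, 2.0),
    ("job3", "bob", "n004", 200.0, 8.0),
]


def build_db(samples):
    """Load node-level samples, then derive one row per job."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node_stats ("
        "job_id TEXT, user TEXT, node TEXT, cpu_seconds REAL, mem_gb REAL)"
    )
    conn.executemany("INSERT INTO node_stats VALUES (?, ?, ?, ?, ?)", samples)
    # Map node-level rows to job-wise rows (assumed aggregation rules:
    # CPU time is summed across nodes, memory is the per-node maximum).
    conn.execute(
        "CREATE TABLE job_stats AS "
        "SELECT job_id, user, COUNT(*) AS nodes, "
        "SUM(cpu_seconds) AS cpu_seconds, MAX(mem_gb) AS peak_mem_gb "
        "FROM node_stats GROUP BY job_id, user"
    )
    conn.commit()
    return conn


def cpu_by_job(conn):
    """Job-level granularity: CPU seconds per job."""
    rows = conn.execute("SELECT job_id, cpu_seconds FROM job_stats")
    return dict(rows.fetchall())


def cpu_by_user(conn):
    """User-level granularity: CPU seconds summed over each user's jobs."""
    rows = conn.execute(
        "SELECT user, SUM(cpu_seconds) FROM job_stats GROUP BY user"
    )
    return dict(rows.fetchall())


def cpu_system_wide(conn):
    """System-wide granularity: CPU seconds summed over all jobs."""
    return conn.execute("SELECT SUM(cpu_seconds) FROM job_stats").fetchone()[0]


if __name__ == "__main__":
    db = build_db(NODE_SAMPLES)
    print("per job: ", cpu_by_job(db))
    print("per user:", cpu_by_user(db))
    print("system:  ", cpu_system_wide(db))
```

Keeping the job-wise table in a relational store means each level of granularity is a different `GROUP BY` over the same rows, which is one plausible reading of why the abstract says the database "eases" later transformation.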

References (11)
Markus Geimer, Pavel Saviankou, Alexandre Strube, Zoltán Szebenyi, Felix Wolf, Brian J. N. Wylie. Further improving the scalability of the Scalasca toolset. Parallel Computing, pp. 463-473 (2010). DOI: 10.1007/978-3-642-28145-7_45
David W. Scott. Multivariate Density Estimation. Wiley Series in Probability and Statistics (1992). DOI: 10.1002/9780470316849
M. Valiev, E.J. Bylaska, N. Govind, K. Kowalski, T.P. Straatsma, H.J.J. Van Dam, D. Wang, J. Nieplocha, E. Apra, T.L. Windus, W.A. de Jong. NWChem: a comprehensive and scalable open-source solution for large scale molecular simulations. Computer Physics Communications, vol. 181, pp. 1477-1489 (2010). DOI: 10.1016/J.CPC.2010.04.018
Kevin A. Huck, Allen D. Malony, Sameer Shende, Alan Morris. Knowledge support and automation for performance analysis with PerfExplorer 2.0. Scientific Programming, vol. 16, pp. 123-134 (2008). DOI: 10.1155/2008/985194
Davide Del Vento, Thomas Engel, Siddhartha S. Ghosh, David L. Hart, Rory Kelly, Si Liu, Richard Valent. System-level monitoring of floating-point performance to improve effective system utilization. IEEE International Conference on High Performance Computing Data and Analytics, p. 5 (2011). DOI: 10.1145/2063348.2063355
Martin Burtscher, Byoung-Do Kim, Jeff Diamond, John McCalpin, Lars Koesterke, James Browne. PerfExpert: An Easy-to-Use Performance Diagnosis Tool for HPC Applications. IEEE International Conference on High Performance Computing Data and Analytics, pp. 1-11 (2010). DOI: 10.1109/SC.2010.41
N. Tallent, J. Mellor-Crummey, L. Adhianto, M. Fagan, M. Krentel. HPCToolkit: performance tools for scientific computing. Journal of Physics: Conference Series, vol. 125, p. 012088 (2008). DOI: 10.1088/1742-6596/125/1/012088
Bernard R. Brooks, Charles L. Brooks III, Alexander D. Mackerell Jr., Lennart Nilsson, Robert J. Petrella, Benoît Roux, Youngdo Won, Georgios Archontis, Christian Bartels, Stefan Boresch, Amedeo Caflisch, L. Caves, Qiang Cui, Aaron R. Dinner, Michael Feig, S. Fischer, Jiali Gao, Milan Hodoscek, Wonpil Im, Krzysztof Kuczera, Themis Lazaridis, J. Ma, Victor Ovchinnikov, Emanuele Paci, Richard W. Pastor, Carol Beth Post, J.Z. Pu, Michael Schaefer, Bruce Tidor, Richard M. Venable, H. Lee Woodcock, Xiongwu Wu, Wei Yang, Darrin M. York, Martin Karplus. CHARMM: the biomolecular simulation program. Journal of Computational Chemistry, vol. 30, pp. 1545-1614 (2009). DOI: 10.1002/JCC.21287
Sameer S. Shende, Allen D. Malony. The Tau Parallel Performance System. International Journal of High Performance Computing Applications, vol. 20, pp. 287-311 (2006). DOI: 10.1177/1094342006064482
José M. Soler, Emilio Artacho, Julian D. Gale, Alberto García, Javier Junquera, Pablo Ordejón, Daniel Sánchez-Portal. The SIESTA method for ab initio order-N materials simulation. Journal of Physics: Condensed Matter, vol. 14, pp. 2745-2779 (2002). DOI: 10.1088/0953-8984/14/11/302