Dissecting on-node memory access performance: a semantic approach

Authors: Alfredo Gimenez, Todd Gamblin, Barry Rountree, Abhinav Bhatele, Ilir Jusufi

DOI: 10.1109/SC.2014.19

Abstract: Optimizing memory access is critical for performance and power efficiency. CPU manufacturers have developed sampling-based performance measurement units (PMUs) that report precise costs of memory accesses at specific addresses. However, this data is too low-level to be meaningfully interpreted and contains an excessive amount of irrelevant or uninteresting information. We have developed a method to gather fine-grained memory access performance data for specific data objects and regions of code with low overhead and attribute semantic information to the sampled memory accesses. This information provides the context necessary to more effectively interpret the data. We have developed a tool that performs this sampling and attribution and used the tool to discover and diagnose performance problems in real-world applications. Our techniques provide useful insight into the memory behaviour of applications and allow programmers to understand the performance ramifications of key design decisions: domain decomposition, multi-threading, and data motion within distributed memory systems.
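As a rough illustration of the attribution step the abstract describes, the sketch below maps sampled addresses onto named data objects and aggregates their access cost. It is not the authors' tool: the `(address, latency)` sample format, the region tuples, and the function names `build_index`, `attribute`, and `summarize` are all hypothetical, and the samples are made-up values rather than real PMU output.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(regions):
    """Sort hypothetical (start, size, label) regions by start address.

    Assumes the regions do not overlap.
    """
    ordered = sorted(regions)
    starts = [start for start, _, _ in ordered]
    return starts, ordered


def attribute(sample_addr, index):
    """Return the label of the region containing sample_addr, or None."""
    starts, ordered = index
    # Find the last region whose start address is <= sample_addr.
    i = bisect_right(starts, sample_addr) - 1
    if i >= 0:
        start, size, label = ordered[i]
        if start <= sample_addr < start + size:
            return label
    return None


def summarize(samples, regions):
    """Aggregate (address, latency) samples into per-object statistics.

    Returns {label: (sample_count, mean_latency)}.
    """
    index = build_index(regions)
    totals = defaultdict(lambda: [0, 0])
    for addr, latency in samples:
        label = attribute(addr, index) or "<unattributed>"
        totals[label][0] += 1
        totals[label][1] += latency
    return {label: (count, cycles / count)
            for label, (count, cycles) in totals.items()}


# Illustrative data only: two named allocations and four sampled accesses.
regions = [(0x1000, 0x100, "grid"), (0x2000, 0x80, "halo_buffer")]
samples = [(0x1010, 40), (0x10F0, 200), (0x2004, 12), (0x3000, 300)]
summary = summarize(samples, regions)
```

Grouping by data object rather than raw address is what makes the samples readable here; a real implementation would also need to track allocations over time and attach code-region and hardware-topology context, which this sketch leaves out.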

References (22)
Christine Deane, George Ho, Phil Mucci, Shirley Browne. PAPI: A Portable Interface to Hardware Performance Counters. HPCMP Users Group Conference (1999).
Vivien Quéma, Baptiste Lepers, Renaud Lachaize. MemProf: a memory profiler for NUMA multicore systems. USENIX Annual Technical Conference, pp. 5-5 (2012).
Gerhard Wellein, Georg Hager, Jan Treibig. LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments. International Conference on Parallel Processing, pp. 207-216 (2010). DOI: 10.1109/ICPPW.2010.38
François Broquedis, Jerome Clet-Ortega, Stephanie Moreaud, Nathalie Furmento, Brice Goglin, Guillaume Mercier, Samuel Thibault, Raymond Namyst. hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications. Parallel, Distributed and Network-Based Processing, pp. 180-186 (2010). DOI: 10.1109/PDP.2010.67
Nick Rutar, Jeffrey K. Hollingsworth. Software techniques for negating skid and approximating cache miss measurements. Parallel Computing, vol. 39, pp. 120-131 (2013). DOI: 10.1016/J.PARCO.2012.09.004
Wm. A. Wulf, Sally A. McKee. Hitting the memory wall. ACM SIGARCH Computer Architecture News, vol. 23, pp. 20-24 (1995). DOI: 10.1145/216585.216588
R. Bruce Irvin, Barton P. Miller. Mapping performance data for high-level and data views of parallel program performance. International Conference on Supercomputing, pp. 69-77 (1996). DOI: 10.1145/237578.237587
Alfred Inselberg, Bernard Dimsdale. Parallel coordinates: a tool for visualizing multi-dimensional geometry. IEEE Visualization, pp. 361-378 (1990). DOI: 10.5555/949531.949588
Ted Selker, Larry Carter, Bowen Alpern. Visualizing computer memory architectures. IEEE Visualization, pp. 107-113 (1990). DOI: 10.5555/949531.949548
Nick Rutar, Jeffrey K. Hollingsworth. Data centric techniques for mapping performance data to program variables. Parallel Computing, vol. 38, pp. 2-14 (2012). DOI: 10.1016/J.PARCO.2011.10.006