A statistical multiprocessor cache model

Authors: E. Berg, H. Zeffer, E. Hagersten

DOI: 10.1109/ISPASS.2006.1620793

Abstract: The introduction of general-purpose microprocessors running multiple threads will put a focus on methods and tools that help programmers write efficient parallel applications. Such a tool should be fast enough to meet a software developer's need for short turn-around time, but also accurate and flexible enough to provide trend-correct and intuitive feedback. This paper presents a novel sample-based method for analyzing the data locality of a multithreaded application. Very sparse data is collected during a single execution of the studied application. The architecture-independent information collected during the execution is fed to a mathematical memory-system model that predicts the cache miss ratio. The sparse data can be used to characterize the application's data locality with respect to almost any possible memory system, such as complicated multiprocessor multilevel cache hierarchies. Any combination of cache size, cache-line size and degree of sharing can be modeled. Each modeled design point takes only a fraction of a second to evaluate, even though the application from which the sampled data was collected may have executed for hours. This makes the tool usable not just for software developers, but also for hardware developers who need to evaluate a huge memory-system design space. The accuracy of the method is evaluated using a large number of commercial and technical multi-threaded applications. The result produced by the algorithm is shown to be consistent with results from traditional (and much slower) architecture simulation.
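
The abstract does not give the model's equations, so the following is only a rough sketch of how sparse reuse-distance samples could be turned into a miss-ratio estimate. It assumes a fully associative cache with random replacement and a single thread, and it ignores cold misses and the sharing effects that the paper's multiprocessor model addresses. The function names and the fixed-point formulation are illustrative assumptions, not the authors' implementation.

```python
def miss_probability(reuse_distance, miss_ratio, cache_lines):
    """Chance that a line is evicted before its next use.

    Assumption: random replacement, so a line survives one replacement
    with probability (1 - 1/cache_lines). Between two uses that are
    `reuse_distance` memory references apart, roughly
    reuse_distance * miss_ratio replacements take place.
    """
    replacements = reuse_distance * miss_ratio
    return 1.0 - (1.0 - 1.0 / cache_lines) ** replacements


def estimate_miss_ratio(reuse_distances, cache_lines, iterations=500):
    """Solve R = mean(miss_probability(d, R)) by fixed-point iteration.

    Starting from R = 1 makes the iteration descend towards the largest
    solution instead of settling on the trivial solution R = 0.
    Cold misses are not modelled in this sketch.
    """
    if not reuse_distances:
        raise ValueError("need at least one reuse-distance sample")
    ratio = 1.0
    for _ in range(iterations):
        ratio = sum(
            miss_probability(d, ratio, cache_lines) for d in reuse_distances
        ) / len(reuse_distances)
    return ratio


if __name__ == "__main__":
    # Hypothetical sparse samples: references between two uses of a line.
    samples = [100, 1_000, 10_000, 100_000]
    for lines in (256, 4096, 65536):
        print(lines, round(estimate_miss_ratio(samples, lines), 4))
```

Because the samples do not depend on the cache configuration, the same list can be re-evaluated for any number of cache lines without re-running the application, which is the property the abstract highlights.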
