Authors: E. Berg, H. Zeffer, E. Hagersten
DOI: 10.1109/ISPASS.2006.1620793
Keywords:
Abstract: The introduction of general-purpose microprocessors running multiple threads will put a focus on methods and tools that help programmers write efficient parallel applications. Such a tool should be fast enough to meet the software developer's need for short turn-around time, but also accurate and flexible enough to provide trend-correct and intuitive feedback. This paper presents a novel sample-based method for analyzing the data locality of a multithreaded application. Very sparse data is collected during a single execution of the studied application. The architecture-independent information collected during the execution is fed to a mathematical memory-system model for predicting the cache miss ratio. The sparse data can be used to characterize the application's data locality with respect to almost any possible memory system, such as complicated multiprocessor multilevel cache hierarchies. Any combination of cache size, cache-line size and degree of sharing can be modeled. Each modeled design point takes only a fraction of a second to evaluate, even though the application from which the sampled data was collected may have executed for hours. This makes the tool usable not just for software developers, but also for hardware developers who need to evaluate a huge memory-system design space. The accuracy of the method is evaluated using a large number of commercial and technical multi-threaded applications. The result produced by the algorithm is shown to be consistent with results from traditional (and much slower) architecture simulation.
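The abstract does not state the model's equations, so the following is only an illustrative sketch of how sparse locality samples might be turned into a miss-ratio prediction for different cache sizes. It assumes (none of this is confirmed by the abstract) a single fully associative cache with random replacement, samples expressed as reuse distances, and a simple fixed-point iteration; the function name `predict_miss_ratio` and its parameters are hypothetical, and the sharing between threads that the paper models is not covered here.

```python
import math


def predict_miss_ratio(reuse_distances, cache_lines, iterations=200):
    """Estimate a miss ratio from sampled reuse distances (illustrative only).

    reuse_distances: sampled counts of memory references between two
        accesses to the same cache line; math.inf marks a sample whose
        line was never reused (treated as a cold miss).
    cache_lines: number of lines in the modeled cache. Assumption: fully
        associative, random replacement, so every miss evicts a given
        line with probability 1 / cache_lines.
    """
    survive = 1.0 - 1.0 / cache_lines  # chance a line survives one miss
    ratio = 1.0  # start from the pessimistic end and iterate downward
    for _ in range(iterations):
        total = 0.0
        for d in reuse_distances:
            if math.isinf(d):
                total += 1.0  # never reused: always a miss
            else:
                # About ratio * d misses occur between the two accesses;
                # the reference misses if any of them evicted the line.
                total += 1.0 - survive ** (ratio * d)
        ratio = total / len(reuse_distances)
    return ratio


# The same samples can be re-evaluated for any cache size.
samples = [10, 100, 1000, math.inf]
small_cache = predict_miss_ratio(samples, cache_lines=64)
large_cache = predict_miss_ratio(samples, cache_lines=4096)
```

Because the samples do not depend on the cache being modeled, each additional design point costs only one more call to the function, which mirrors the abstract's claim that a design point is cheap to evaluate once the data has been collected.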