Authors: David Eklöv, Nikos Nikoleris, Erik Hagersten
DOI:
Keywords:
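Abstract: To reduce latency and increase bandwidth to memory, modern microprocessors are often designed with deep memory hierarchies including several levels of caches. For such microprocessors, both the latency and bandwidth to off-chip memory are typically about two orders of magnitude worse than the latency and bandwidth to the fastest on-chip cache. Consequently, the performance of many applications is largely determined by how well they utilize the caches and bandwidths in the memory hierarchy. For such applications, there are two principal approaches to improve performance: optimize the memory hierarchy, or optimize the software. In both cases, it is important to understand, both qualitatively and quantitatively, how the software utilizes and interacts with the resources (e.g., cache and bandwidths) in the memory hierarchy. This thesis presents several novel profiling methods for memory-centric software performance analysis. The goal of these profiling methods is to provide general, high-level, quantitative information describing how the profiled applications utilize the resources in the memory hierarchy, and thereby help software and hardware developers identify opportunities for memory-related hardware and software optimizations. For such techniques to be broadly applicable, the data collection should have minimal impact on the profiled application, while not being dependent on custom hardware and/or operating system extensions. Furthermore, the resulting profiling information should be accurate and easy to interpret. While several use cases are presented, the main focus of this thesis is the design and evaluation of the core profiling methods. These core profiling methods measure and/or estimate how high-level performance metrics, such as miss and fetch ratios, off-chip bandwidth demand, and execution rate, are affected by the amount of resources the profiled applications receive. This thesis shows that such high-level profiling information can be accurately obtained with very little impact on the profiled applications and without requiring costly simulations or custom hardware support.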