Authors: Kathryn S. McKinley, Zhenlin Wang, Arnold L. Rosenberg, Charles C. Weems
Keywords:
Abstract: Memory performance is increasingly determining microprocessor performance, and technology trends are exacerbating this problem. Most architectures use set-associative caches with LRU replacement policies to combine fast access with relatively low miss rates. To improve replacement decisions in set-associative caches, we develop a new set of compiler algorithms that predict which data will not be reused and provide these hints to the architecture. We prove that the hints either match or improve hit rates over LRU. We describe a practical one-bit cache-line tag implementation of our algorithm, called evict-me. On a cache replacement, the architecture will replace the line for which the evict-me bit is set, or, if none is set, it will use the LRU bits. We implement our compiler analysis and its output in the Scale compiler. On a variety of scientific programs, using the evict-me algorithm in both the level 1 and level 2 caches improves simulated cycle times by up to 34% over the LRU policy by increasing hit rates. In addition, a combination of simple hardware prefetching and evict-me works together to further improve performance.
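The replacement rule in the abstract (prefer a line whose evict-me bit is set, otherwise fall back to LRU) can be illustrated with a minimal simulation of a single cache set. This is a sketch under stated assumptions, not the authors' implementation: the names `EvictMeSet`, `count_hits`, and `TRACE` are hypothetical, the hint is modeled as a flag supplied with each access, and when several lines have the bit set the sketch picks the least recently used of them, a detail the abstract does not specify.

```python
from collections import OrderedDict


class EvictMeSet:
    """One set of a set-associative cache: LRU order plus a one-bit evict-me tag per line."""

    def __init__(self, ways):
        self.ways = ways
        # Maps line address -> evict-me bit; iteration order runs from LRU to MRU.
        self.lines = OrderedDict()

    def access(self, addr, evict_me=False):
        """Touch addr and record its hint bit; return True on a hit, False on a miss."""
        hit = addr in self.lines
        if hit:
            self.lines.move_to_end(addr)  # becomes most recently used
        elif len(self.lines) >= self.ways:
            self._evict()
        self.lines[addr] = evict_me
        return hit

    def _evict(self):
        # Prefer a line whose evict-me bit is set (assumption: the LRU-most such line).
        victim = next((a for a, bit in self.lines.items() if bit), None)
        if victim is not None:
            del self.lines[victim]
        else:
            self.lines.popitem(last=False)  # no bit set: plain LRU


def count_hits(trace, ways, use_hints):
    """Replay (address, hint) pairs on one set; ignore the hints when use_hints is False."""
    cache_set = EvictMeSet(ways)
    return sum(cache_set.access(addr, hint and use_hints) for addr, hint in trace)


# Hypothetical trace: A and B are reused, S1 and S2 are streamed once and hinted evict-me.
TRACE = [("A", False), ("B", False), ("S1", True), ("A", False),
         ("B", False), ("S2", True), ("A", False), ("B", False)]

if __name__ == "__main__":
    print("LRU hits:     ", count_hits(TRACE, ways=2, use_hints=False))
    print("evict-me hits:", count_hits(TRACE, ways=2, use_hints=True))
```

On this toy trace with a 2-way set, plain LRU thrashes and never hits, while the hinted policy keeps one of the reused lines resident and hits twice. The trace is constructed for illustration only and says nothing about the speedups reported in the abstract.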