Adaptive Cache Compression for High-Performance Processors

Authors: Alaa R. Alameldeen, David A. Wood

Keywords: Cache invalidation, Cache coloring, Page cache, Cache-oblivious algorithm, Computer science, Cache pollution, Parallel computing, Memory architecture, CPU cache, Cache, Smart Cache, Write-once, Cache algorithms

Abstract: Modern processors use two or more levels of cache memories to bridge the rising disparity between processor and memory speeds. Compression can improve cache performance by increasing effective cache capacity and eliminating misses. However, decompressing cache lines also increases cache access latency, potentially degrading performance. In this paper, we develop an adaptive policy that dynamically adapts to the costs and benefits of cache compression. We propose a two-level cache hierarchy where the L1 cache holds uncompressed data and the L2 cache dynamically selects between compressed and uncompressed storage. The L2 cache is 8-way set-associative with LRU replacement, where each set can store up to eight compressed lines but has space for only four uncompressed lines. On each L2 reference, the LRU stack depth and compressed size determine whether compression (could have) eliminated a miss or incurs an unnecessary decompression overhead. Based on this outcome, the adaptive policy updates a single global saturating counter, which predicts whether to allocate lines in compressed or uncompressed form. We evaluate adaptive cache compression using full-system simulation and a range of benchmarks. We show that compression can improve performance for memory-intensive commercial workloads by up to 17%. However, always using compression hurts performance for low-miss-rate benchmarks, due to unnecessary decompression overhead, degrading performance by up to 18%. By dynamically monitoring workload behavior, the adaptive policy achieves comparable benefits from compression, while never degrading performance by more than 0.4%.
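
The policy described in the abstract can be illustrated with a minimal sketch. It classifies each L2 reference by LRU stack depth and then updates one global saturating counter. The set geometry (four uncompressed ways, eight compressed lines) comes from the abstract. Everything else is an assumption made for illustration: the names `classify` and `GlobalPredictor`, the `fits_if_compressed` flag standing in for the compressed-size check, the penalty weights, the counter limit, and the rule that a positive counter means "allocate compressed".

```python
from dataclasses import dataclass

UNCOMPRESSED_WAYS = 4  # from the abstract: room for four uncompressed lines per set
MAX_WAYS = 8           # from the abstract: up to eight compressed lines per set

# Hypothetical values chosen for illustration; the abstract does not give them.
MISS_PENALTY = 400
DECOMPRESSION_PENALTY = 5
COUNTER_LIMIT = 1 << 18


def classify(hit, depth, compressed, fits_if_compressed=False):
    """Label one L2 reference.

    depth is the 1-based LRU stack depth of the referenced line, or None
    if the line is not tracked at all. fits_if_compressed is an assumed
    stand-in for the compressed-size check mentioned in the abstract.
    """
    if hit:
        if depth > UNCOMPRESSED_WAYS:
            return "avoided_miss"    # resident only because of compression
        if compressed:
            return "penalized_hit"   # would have hit anyway; decompression wasted
        return "neutral"
    if depth is not None and depth <= MAX_WAYS and fits_if_compressed:
        return "avoidable_miss"      # compression could have kept this line
    return "neutral"


@dataclass
class GlobalPredictor:
    """Single global saturating counter (weights and limit are assumptions)."""
    value: int = 0

    def update(self, outcome):
        if outcome in ("avoided_miss", "avoidable_miss"):
            delta = MISS_PENALTY            # evidence in favor of compression
        elif outcome == "penalized_hit":
            delta = -DECOMPRESSION_PENALTY  # evidence against compression
        else:
            delta = 0
        # Clamp so the counter saturates instead of growing without bound.
        self.value = max(-COUNTER_LIMIT, min(COUNTER_LIMIT, self.value + delta))

    def allocate_compressed(self):
        # Assumed decision rule: compress only while benefits outweigh costs.
        return self.value > 0
```

A simulator built along these lines would call `classify` on every L2 reference, feed the result to `update`, and consult `allocate_compressed` when a new line is allocated.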
