Authors: Mateo Valero, Kamil Kedzierski, Miquel Moreto, Francisco J. Cazorla
DOI: 10.1109/IPDPS.2010.5470352
Keywords:
Abstract: Recent studies have shown that cache partitioning is an efficient technique to improve throughput, fairness and Quality of Service (QoS) in CMP processors. The cache partitioning algorithms proposed so far assume Least Recently Used (LRU) as the underlying replacement policy. However, it has been shown that true LRU imposes extraordinary complexity and area overheads when implemented on high associativity caches, such as last level caches. As a consequence, current processors available on the market use pseudo-LRU replacement policies, which provide behavior similar to LRU while reducing the hardware complexity. Thus, the LRU-based cache partitioning solutions presented so far cannot be applied to real CMP architectures. This paper proposes a complete partitioning system for caches using pseudo-LRU replacement policies. In particular, the paper focuses on the pseudo-LRU implementations proposed by Sun Microsystems and IBM, called Not Recently Used (NRU) and Binary Tree (BT), respectively. We propose high accuracy profiling logic and cache partitioning hardware for both schemes. We evaluate our proposals' hardware costs in terms of area and power, and compare them against the LRU partitioning algorithm. Overall, this paper presents two hardware techniques to adapt the existing cache partitioning algorithms to real replacement policies. The results show that our solutions impose negligible performance degradation with respect to LRU.
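For readers unfamiliar with NRU, the following minimal Python sketch shows a generic textbook formulation of the policy for a single cache set. It is an illustration only, not the paper's profiling logic or partitioning hardware, and actual processor implementations may differ in detail. The class name `NRUSet`, its methods, and the choice of the lowest-numbered way with a clear bit as the victim are assumptions made for this example.

```python
class NRUSet:
    """One cache set under a generic Not Recently Used (NRU) policy.

    Hypothetical illustration: one "used" bit per way instead of the
    full recency ordering that true LRU would have to maintain.
    """

    def __init__(self, ways):
        if ways < 2:
            raise ValueError("this sketch assumes at least two ways")
        self.tags = [None] * ways    # tag held by each way (None = empty)
        self.used = [False] * ways   # one used bit per way

    def _touch(self, way):
        """Mark a way as recently used."""
        self.used[way] = True
        # Once every bit is set, clear all bits except the one just touched,
        # so that at least one replacement candidate always exists.
        if all(self.used):
            self.used = [i == way for i in range(len(self.used))]

    def access(self, tag):
        """Return True on a hit; on a miss, fill the line and return False."""
        if tag in self.tags:
            self._touch(self.tags.index(tag))
            return True
        # Assumed victim choice: lowest-numbered way whose used bit is clear.
        victim = self.used.index(False)
        self.tags[victim] = tag
        self._touch(victim)
        return False
```

The sketch keeps one bit per way, whereas true LRU must track a complete ordering of the ways in each set, which is the source of the hardware savings the abstract refers to.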