Authors: Samantika Subramaniam, Simon C. Steely, Will Hasenplaugh, Aamer Jaleel, Carl Beckmann
Keywords:
Abstract: As microprocessor designs integrate more cores, scalability of cache coherence protocols becomes a challenging problem. Most directory-based protocols avoid races by using blocking tag directories that can impact the performance of parallel applications. In this article, we first quantitatively demonstrate that state-of-the-art blocking protocols significantly constrain throughput at large core counts for several parallel applications. Nonblocking protocols address this throughput concern at the expense of scalability in the interconnection network or in the required resource overheads. To address this concern, we enhance nonblocking directory protocols by migrating the point of service of responses. Our approach uses in-flight chains of cores making parallel memory requests to incorporate scalability while maintaining high throughput. The proposed cache coherence protocol, called chained cache coherence, can outperform blocking protocols by up to 20% on scientific and 12% on commercial applications. It also has low resource overheads and simple address ordering requirements, making it both a high-performance and scalable protocol. Furthermore, in-flight chains provide a scalable solution for building hierarchical and nonblocking tag directories and for optimizing communication latencies.