Exploiting Hidden Non-uniformity of Uniform Memory Access on Manycore CPUs

Authors: Balazs Gerofi, Masamichi Takagi, Yutaka Ishikawa

DOI: 10.1007/978-3-319-14313-2_21

Keywords: Computer science, Clock rate, Parallel computing, Xeon Phi, Uniform memory access, Memory controller, Multi-core processor

Abstract: As the rate of CPU clock improvement has stalled for the last decade, increased use of parallelism in the form of multi- and many-core processors has been pursued to improve overall performance. Current high-end manycore CPUs already accommodate up to hundreds of processing cores. At the same time, these architectures come with complex on-chip networks for inter-core communication and multiple memory controllers for accessing off-chip RAM modules. Intel's latest Many Integrated Cores (MIC) chip, also called Xeon Phi, boasts 60 cores (each with 4-way SMT) combined with eight memory controllers. Although the chip provides Uniform Memory Access (UMA), we find that there are substantial (as high as 60%) differences in access latencies for different memory blocks depending on which core issues the request, resembling Non-Uniform Memory Access (NUMA) architectures.
