Authors: Balazs Gerofi, Masamichi Takagi, Yutaka Ishikawa
DOI: 10.1007/978-3-319-14313-2_21
Keywords: Computer science, Clock rate, Parallel computing, Xeon Phi, Uniform memory access, Memory controller, Multi-core processor
Abstract: As the rate of CPU clock improvement has stalled over the last decade, increased use of parallelism in the form of multi- and many-core processors has been pursued to improve overall performance. Current high-end manycore CPUs already accommodate up to hundreds of processing cores. At the same time, these architectures come with complex on-chip networks for inter-core communication and multiple memory controllers for accessing off-chip RAM modules. Intel’s latest Many Integrated Cores (MIC) chip, also called the Xeon Phi, boasts 60 cores (each with 4-way SMT) combined with eight memory controllers. Although the chip provides Uniform Memory Access (UMA), we find that there are substantial (as high as 60%) differences in access latencies for different memory blocks depending on which core issues the request, resembling Non-Uniform Memory Access (NUMA) architectures.