An efficient and general implementation of futures on large scale shared-memory multiprocessors

Author: Marc Feeley

Abstract: This thesis describes a high-performance implementation technique for Multilisp's "future" parallelism construct. The method addresses the non-uniform memory access (NUMA) problem inherent in large scale shared-memory multiprocessors. The technique is based on lazy task creation (LTC), a dynamic task partitioning mechanism that dramatically reduces the cost of task creation and consequently makes it possible to exploit fine grain parallelism. In LTC, idle processors get work to do by "stealing" tasks from other processors. A previously proposed implementation of LTC is the shared-memory (SM) protocol. The main disadvantage of the SM protocol is that it requires the stack to be cached suboptimally on cache-incoherent machines. This thesis proposes a new protocol that allows full caching of the stack: the message-passing (MP) protocol. Idle processors ask for work by sending "work request" messages to other processors. After receiving such a message, a processor checks its private queue and sends back a task if one is available. The message-passing protocol has the added benefits of a lower task creation cost and simpler algorithms. Extensive experiments evaluate the performance of both protocols on large shared-memory multiprocessors: a 90 processor GP1000 and a 32 processor TC2000. The results show that the MP protocol is consistently better than the SM protocol. The difference in performance is as high as a factor of two when a cache is available and a factor of 1.2 when a cache is not available. In addition, the thesis shows that the semantics of the Multilisp language does not have to be impoverished to attain good performance. The laziness of LTC can be exploited to support, at virtually no cost, several programming features including: the Katz-Weise continuation semantics with legitimacy, dynamic scoping, and fairness.
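
The request-and-reply exchange at the heart of the MP protocol can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not the thesis's implementation: the `Processor` class, its `request_work` and `poll` methods, and the `demo` function are hypothetical names introduced here, and handing over the oldest queued task first is an assumption made for the example. It shows an idle processor posting a "work request" message and the busy processor answering from a queue that only it touches.

```python
from collections import deque


class Processor:
    """Hypothetical model of one processor (illustrative names only)."""

    def __init__(self, pid):
        self.pid = pid
        self.tasks = deque()    # private task queue, touched only by its owner
        self.mailbox = deque()  # incoming "work request" messages

    def request_work(self, victim):
        # An idle processor asks for work by sending a message to the victim.
        victim.mailbox.append(self)

    def poll(self):
        # The owner answers each pending request from its own private queue.
        replies = []
        while self.mailbox:
            thief = self.mailbox.popleft()
            # Assumption for this sketch: the oldest queued task is given away.
            task = self.tasks.popleft() if self.tasks else None
            if task is not None:
                thief.tasks.append(task)
            replies.append((thief.pid, task))
        return replies


def demo():
    busy, idle = Processor(0), Processor(1)
    busy.tasks.extend(["t1", "t2", "t3"])
    idle.request_work(busy)   # idle processor sends a "work request"
    replies = busy.poll()     # busy processor checks its queue and replies
    return replies, list(busy.tasks), list(idle.tasks)
```

Because only the owner ever reads or writes its own queue, nothing in this model requires the queue to be shared between processors, which is the property the abstract credits with allowing full caching of the stack. A request that arrives when the queue is empty simply receives `None` as its reply.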
