Authors: Vijay Karamcheti, John Plevyak, Andrew A. Chien
Keywords:
Abstract: High performance on distributed memory machines for programming models with dynamic thread creation and multithreading requires efficient thread management and communication. Traditional runtimes, consisting of a few general-purpose, bundled mechanisms that assume minimal compiler and hardware support, are suitable for computations involving coarse-grained threads but provide low efficiency in the presence of small-granularity threads and irregular communication behavior. We describe two mechanisms of the Illinois Concert runtime system which address this shortcoming. The first, hybrid stack-heap execution, exploits close coupling with the compiler to dynamically form execution units; threads are lazily created as required by runtime situations. The second, pull messaging, exploits hardware support to implement a message queue with receiver-initiated data transfer, delivering robust performance across a wide range of communication characteristics. We measure the performance of the mechanisms and their impact based on a Cray T3D implementation of the system. Individually, the mechanisms increase absolute performance by up to 50%. Together, they increase the feasible space of computations, enabling compute granularities an order of magnitude smaller. Performance results for large applications demonstrate that expressing programs using dynamic multithreading need not compromise performance.