Authors: Douglas P. Ghormley, David Petrou, Steven H. Rodrigues, Amin M. Vahdat, Thomas E. Anderson
DOI: 10.1002/(SICI)1097-024X(19980725)28:9<929::AID-SPE183>3.0.CO;2-C
Keywords:
Abstract: Recent improvements in network and workstation performance have made clusters an attractive architecture for diverse workloads, including interactive sequential and parallel applications. Although viable hardware solutions are available today, the largest challenge in making such a cluster usable lies in the system software. This paper describes the design and implementation of GLUnix, operating system middleware for a cluster of workstations. GLUnix was designed to provide transparent remote execution, support for interactive parallel and sequential jobs, load balancing, and backward compatibility for existing application binaries. GLUnix was constructed to be easily portable to a number of platforms. GLUnix has been in daily use for over two and a half years and is currently running on a 100-node cluster of Sun UltraSPARCs. This paper relates our experiences with designing, building, and running GLUnix. We discuss three important tradeoffs faced by any such system, and present the reasons for our choices. Each of these decisions is then re-evaluated in light of both our experience and recent technological advancements. We describe the user-level, centralized, event-driven architecture of GLUnix and highlight aspects of the implementation. Performance and scalability measurements indicate that a user-level design can scale gracefully to significant cluster sizes, incurring only an additional 220 μs of overhead per node for remote execution. The discussion focuses on the successes and failures we encountered while building and maintaining the system, including a characterization of the limitations of a user-level implementation and the various features that were added to satisfy the user community. © 1998 John Wiley & Sons, Ltd.