Authors: Ping Luo, Kevin Lü, Rui Huang, Qing He, Zhongzhi Shi
DOI: 10.1111/J.1468-0394.2006.00408.X
Keywords:
Abstract: The computing-intensive data mining (DM) process calls for the support of a heterogeneous computing system, which consists of multiple computers with different configurations connected by a high-speed large-area network for increased computational power and resources. The DM process can be described as a multi-phase pipeline process, and in each phase there could be many optional methods. This makes the DM workflow very complex, so that it can be modeled only by a directed acyclic graph (DAG). A heterogeneous computing system needs an effective and efficient scheduling framework, which orchestrates all the computing hardware to perform multiple competitive DM workflows. Motivated by the need for a practical solution to the scheduling problem for the DM workflow, this paper proposes a dynamic DAG scheduling algorithm according to the characteristics of an execution time estimation model for DM jobs. Based on an approximate estimation of job execution time, this algorithm first maps DM jobs to machines in a decentralized and diligent (defined in this paper) manner. Then the performance of this initial mapping can be improved through job migrations when necessary. The scheduling heuristic used considers the factors of both the minimal completion time criterion and the critical path in a DAG. We implement this system in an established multi-agent system environment, in which the reuse of existing DM algorithms is achieved by encapsulating them into agents. The system evaluation and its usage in oil well logging analysis are also discussed.
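The abstract names two ingredients of the scheduling heuristic: the critical path in the DAG and the minimal completion time criterion. The sketch below illustrates how these two ideas are commonly combined in a generic list scheduler. It is not the paper's algorithm: it is centralized and static, and it omits the decentralized "diligent" mapping and the job migration step. The job names, machine names, and time estimates are hypothetical, and all estimates are assumed to be positive.

```python
from collections import defaultdict


def critical_path_rank(jobs, succ, est):
    """Length of the longest remaining path from each job to a sink,
    using the job's average estimated time across machines."""
    memo = {}

    def rank(job):
        if job not in memo:
            avg = sum(est[job].values()) / len(est[job])
            tail = max((rank(s) for s in succ.get(job, [])), default=0.0)
            memo[job] = avg + tail
        return memo[job]

    for job in jobs:
        rank(job)
    return memo


def schedule(jobs, succ, est, machines):
    """Map each job to the machine that gives the earliest completion time,
    visiting jobs in order of decreasing critical-path rank."""
    pred = defaultdict(list)
    for job, successors in succ.items():
        for s in successors:
            pred[s].append(job)

    rank = critical_path_rank(jobs, succ, est)
    machine_free = {m: 0.0 for m in machines}
    finish = {}
    placement = {}

    # With positive estimates a job always outranks its successors,
    # so decreasing rank is also a valid topological order.
    for job in sorted(jobs, key=lambda j: -rank[j]):
        ready = max((finish[p] for p in pred[job]), default=0.0)
        best = min(
            machines,
            key=lambda m: max(machine_free[m], ready) + est[job][m],
        )
        start = max(machine_free[best], ready)
        finish[job] = start + est[job][best]
        machine_free[best] = finish[job]
        placement[job] = (best, start, finish[job])
    return placement


# Hypothetical four-job DM workflow on two machines (all values assumed).
jobs = ["clean", "tree", "cluster", "report"]
succ = {"clean": ["tree", "cluster"], "tree": ["report"], "cluster": ["report"]}
est = {
    "clean": {"fast": 2, "slow": 4},
    "tree": {"fast": 3, "slow": 6},
    "cluster": {"fast": 4, "slow": 5},
    "report": {"fast": 1, "slow": 2},
}
placement = schedule(jobs, succ, est, ["fast", "slow"])
makespan = max(end for _, _, end in placement.values())
```

In this example the two parallel jobs are split across the machines: although "cluster" would run faster on the fast machine in isolation, that machine is busy with "tree", so the slower machine gives the earlier completion time.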