A Heterogeneous Computing System for Data Mining Workflows

Authors: Ping Luo, Kevin Lü, Qing He, Zhongzhi Shi

DOI: 10.1007/11788911_15

Abstract: The computing-intensive Data Mining (DM) process calls for the support of a Heterogeneous Computing (HC) system, which consists of multiple computers with different configurations, connected by a high-speed LAN, for increased computational power and resources. The DM process can be described as a multi-phase pipeline, and in each phase there can be many optional methods. This makes the DM workflow very complex, so it can be modelled only by a Directed Acyclic Graph (DAG). An HC system needs an effective and efficient scheduling framework that orchestrates all the computing hardware to perform multiple competitive DM workflows. Motivated by the need for a practical solution to the scheduling problem for DM workflows, this paper proposes a dynamic DAG scheduling algorithm built on the characteristics of an execution time estimation model for DM jobs. Based on an approximate estimate of job execution time, the algorithm first maps DM jobs to machines in a decentralized and diligent (defined in the paper) manner. The performance of this initial mapping can then be improved through job migrations when necessary. The scheduling heuristic it uses considers both the minimal completion time criterion and the critical path in a DAG. We implement the system in an established Multi-Agent System (MAS) environment, in which the reuse of existing DM algorithms is achieved by encapsulating them into agents. Practical classification problems are used to test and measure the system's performance. The detailed experiment procedure and result analysis are also discussed in the paper.
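The abstract names two ingredients of the scheduling heuristic: the minimal completion time criterion and the critical path in a DAG. The sketch below shows one generic way such ingredients can be combined in a static list scheduler. It is not the paper's algorithm: it omits the decentralized, "diligent" mapping and the job migrations, and the function name `schedule`, the `dag` and `est` structures, and the use of machine-averaged times for ranking are all assumptions made for illustration.

```python
from functools import lru_cache


def schedule(dag, est, machines):
    """Illustrative list scheduler (hypothetical helper, not from the paper).

    dag:      dict mapping each job to the list of its successor jobs
    est:      dict mapping (job, machine) to an estimated run time (> 0)
    machines: list of machine names
    Returns a dict mapping each job to (machine, start_time, finish_time).
    """
    jobs = list(dag)
    preds = {j: [] for j in jobs}
    for j, succs in dag.items():
        for s in succs:
            preds[s].append(j)

    def mean_time(j):
        # Assumption: rank jobs by their run time averaged over all machines.
        return sum(est[j, m] for m in machines) / len(machines)

    @lru_cache(maxsize=None)
    def rank(j):
        # Length of the longest (critical) path from j to an exit job.
        return mean_time(j) + max((rank(s) for s in dag[j]), default=0.0)

    machine_free = {m: 0.0 for m in machines}
    finish = {}
    placement = {}
    # With positive run times, descending rank is a valid topological order.
    for j in sorted(jobs, key=rank, reverse=True):
        # A job can start only after all of its predecessors have finished.
        ready = max((finish[p] for p in preds[j]), default=0.0)
        # Minimal completion time: pick the machine that finishes j earliest.
        best = min(machines, key=lambda m: max(ready, machine_free[m]) + est[j, m])
        start = max(ready, machine_free[best])
        finish[j] = start + est[j, best]
        machine_free[best] = finish[j]
        placement[j] = (best, start, finish[j])
    return placement
```

For a toy workflow in which a preprocessing job feeds two alternative classifiers and a final evaluation step, calling `schedule(dag, est, ["fast", "slow"])` returns one placement per job. A dynamic scheduler of the kind the abstract describes would additionally revise such placements at run time as the time estimates turn out to be inaccurate.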
