Enabling Knowledge Discovery in a Virtual Universe

Authors: Jeffrey P. Gardner, Andrew Connolly, Cameron McBride

Keywords: Data science, Distributed computing, Ntropy, Massively parallel, Computer science, TeraGrid, Scalability, Knowledge extraction, Tree (data structure), Task (computing), Range (computer programming)

Abstract: Virtual observatories will give astronomers easy access to an unprecedented amount of data. Extracting scientific knowledge from these data will increasingly demand both efficient algorithms and the power of parallel computers. Such machines will range in size from small Beowulf clusters to large massively parallel platforms (MPPs) to collections of MPPs distributed across a Grid, such as the NSF TeraGrid facility. Nearly all efficient analyses of large astronomical datasets use trees as their fundamental data structure. Writing efficient tree-based techniques, a task that is time-consuming even on single-processor computers, is exceedingly cumbersome on massively parallel platforms or grid-distributed resources. We have developed a library, Ntropy, that provides a flexible, extensible, and easy-to-use way of developing tree-based data analysis algorithms for both serial and parallel platforms. Our experience has shown that not only does our library save development time, it also delivers an increase in serial performance. Furthermore, Ntropy makes it easy for an astronomer with little or no parallel programming experience to quickly scale their application to a distributed multiprocessor environment. By minimizing development time for efficient and scalable data analysis, we enable wide-scale knowledge discovery on massive datasets.
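To make the abstract's point about trees concrete, the following is a minimal serial sketch of a tree-based pair count, the kind of kernel that underlies correlation-function measurements. It is not Ntropy's API: the names `Node`, `build`, `count_within`, and `pair_count`, and the default `leaf_size`, are hypothetical choices made for illustration only. The sketch assumes a k-d tree with per-node bounding boxes, which lets whole subtrees be skipped or counted without visiting individual points.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    lo: Point                      # bounding-box minimum corner
    hi: Point                      # bounding-box maximum corner
    count: int                     # number of points below this node
    points: List[Point] = field(default_factory=list)  # filled for leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: List[Point], leaf_size: int = 8) -> Node:
    """Build a k-d tree by splitting at the median of the widest axis."""
    dim = len(points[0])
    lo = tuple(min(p[d] for p in points) for d in range(dim))
    hi = tuple(max(p[d] for p in points) for d in range(dim))
    node = Node(lo, hi, len(points))
    if len(points) <= leaf_size:
        node.points = list(points)
        return node
    axis = max(range(dim), key=lambda d: hi[d] - lo[d])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    node.left = build(ordered[:mid], leaf_size)
    node.right = build(ordered[mid:], leaf_size)
    return node


def count_within(node: Node, q: Point, r: float) -> int:
    """Count points in the subtree whose distance to q is at most r."""
    r2 = r * r
    # Smallest and largest possible squared distance from q to the box.
    dmin2 = sum(max(l - x, 0.0, x - h) ** 2
                for x, l, h in zip(q, node.lo, node.hi))
    dmax2 = sum(max(abs(x - l), abs(x - h)) ** 2
                for x, l, h in zip(q, node.lo, node.hi))
    if dmin2 > r2:
        return 0                   # box entirely outside the sphere: prune
    if dmax2 <= r2:
        return node.count          # box entirely inside: count without visiting
    if node.left is None:          # leaf: test each point directly
        return sum(1 for p in node.points
                   if sum((a - b) ** 2 for a, b in zip(p, q)) <= r2)
    return count_within(node.left, q, r) + count_within(node.right, q, r)


def pair_count(points: List[Point], r: float) -> int:
    """Number of unordered pairs of distinct points separated by at most r."""
    root = build(points)
    total = sum(count_within(root, p, r) for p in points)
    return (total - len(points)) // 2  # drop self-matches, halve double counts
```

The two early returns in `count_within` are where the tree saves work compared with testing every pair directly. A parallel version would additionally have to distribute the tree across processors, which is the part the abstract describes as cumbersome to write by hand.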

References (10)
Yossi Shiloach, Uzi Vishkin. An O(log n) parallel connectivity algorithm. Journal of Algorithms, vol. 3, pp. 57-67 (1982). 10.1016/0196-6774(82)90008-6
I. Kayo, Neta Bahcall, I. Szapudi, I. Csabai, R. H. Wechsler, I. Zehavi, A. Pope, F. Marin, D. Schneider, B. Jain, R. C. Nichol, J. Brinkmann, M. Blanton, J. Schneider, A. W. Moore, R. K. Sheth, A. J. Gray, A. Szalay, J. Pun, G. Kulkarni, A. J. Connolly, Y. Suto, J. P. Gardner, C. J. Miller. The Effect of Large-Scale Structure on the SDSS Galaxy Three-Point Correlation Function. Monthly Notices of the Royal Astronomical Society, vol. 368, pp. 1507-1514 (2006). 10.1111/J.1365-2966.2006.10239.X
Andreas Müller, Roland Rühl. Extending high performance Fortran for the support of unstructured computations. Proceedings of the 9th international conference on Supercomputing - ICS '95, pp. 127-136 (1995). 10.1145/224538.224552
John Reid, Robert W. Numrich. Co-arrays in the next Fortran Standard. Scientific Programming, vol. 15, pp. 9-26 (2007). 10.1155/2007/954503
Laxmikant V. Kale, Sanjeev Krishnan. CHARM++. Proceedings of the eighth annual conference on Object-oriented programming systems, languages, and applications - OOPSLA '93, vol. 28, pp. 91-108 (1993). 10.1145/165854.165874
Steven Saunders, Lawrence Rauchwerger. ARMI: an adaptive, platform independent communication library. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, vol. 38, pp. 230-241 (2003). 10.1145/781498.781534
Robert W. Numrich, John Reid. Co-arrays in the next Fortran Standard. ACM SIGPLAN Fortran Forum, vol. 24, pp. 4-17 (2005). 10.1145/1080399.1080400
Robert W. Numrich, John Reid. Co-array Fortran for parallel programming. ACM SIGPLAN Fortran Forum, vol. 17, pp. 1-31 (1998). 10.1145/289918.289920
James C. Phillips, Rosemary Braun, Wei Wang, James Gumbart, Emad Tajkhorshid, Elizabeth Villa, Christophe Chipot, Robert D. Skeel, Laxmikant Kalé, Klaus Schulten. Scalable molecular dynamics with NAMD. Journal of Computational Chemistry, vol. 26, pp. 1781-1802 (2005). 10.1002/JCC.20289
Orion S. Lawlor, Laxmikant V. Kalé. Supporting dynamic parallel object arrays. Proceedings of the 2001 joint ACM-ISCOPE conference on Java Grande, pp. 21-28 (2001). 10.1145/376656.376804