A View from ORNL: Scientific Data Research Opportunities in the Big Data Age

Authors: Scott Klasky, Matthew Wolf, Mark Ainsworth, Chuck Atkins, Jong Choi

DOI: 10.1109/ICDCS.2018.00136

Keywords: Data science, Development plan, Scalability, Software, SPARK (programming language), Big data, Computer science, Workflow, Data visualization, Data modeling

Abstract: One of the core issues across computer and computational science today is adapting to, managing, and learning from the influx of "Big Data". In the commercial space, this problem has led to a huge investment in new technologies and capabilities that are well adapted to dealing with the sorts of human-generated logs, videos, texts, and other large-data artifacts that are processed, and has resulted in an explosion of useful platforms and languages (Hadoop, Spark, Pandas, etc.). However, translating this work from the enterprise space to the HPC community has proven somewhat difficult, in part because of some fundamental differences in the type and scale of the data and in the timescales surrounding its generation and use. We describe a forward-looking research and development plan which centers around the concept of making Input/Output (I/O) intelligent for users in the scientific community, whether they are accessing scalable storage or performing in situ workflow tasks. Much of our work is based on our experience with the Adaptable I/O System (ADIOS 1.X) and the next version of the software, ADIOS 2.X [1].

References (43)

Justin J. Miller, Graph Database Applications and Concepts with Neo4j (2013).

Ilkay Altintas, Chad Berkley, Edward A. Lee, Efrat Jaeger, Bertram Ludäscher, Matthew Jones, Jing Tao, Yang Zhao, Dan Higgins, Scientific workflow management and the Kepler system: Research Articles. Concurrency and Computation: Practice and Experience, vol. 18, pp. 1039-1065 (2006). DOI: 10.1002/CPE.V18:10

Jai Dayal, Jay Lofstead, Greg Eisenhauer, Karsten Schwan, Matthew Wolf, Hasan Abbasi, Scott Klasky, SODA: Science-Driven Orchestration of Data Analytics. 2015 IEEE 11th International Conference on e-Science, pp. 475-484 (2015). DOI: 10.1109/ESCIENCE.2015.59

Qing Liu, Jeremy Logan, Yuan Tian, Hasan Abbasi, Norbert Podhorszki, Jong Youl Choi, Scott Klasky, Roselyne Tchoua, Jay Lofstead, Ron Oldfield, Manish Parashar, Nagiza Samatova, Karsten Schwan, Arie Shoshani, Matthew Wolf, Kesheng Wu, Weikuan Yu, Hello ADIOS: the challenges and lessons of developing leadership class I/O frameworks. Concurrency and Computation: Practice and Experience, vol. 26, pp. 1453-1473 (2014). DOI: 10.1002/CPE.3125

Lipeng Wan, Zheng Lu, Qing Cao, Feiyi Wang, Sarp Oral, Bradley Settlemyer, SSD-optimized workload placement with adaptive learning and classification in HPC environments. IEEE Conference on Mass Storage Systems and Technologies, pp. 1-6 (2014). DOI: 10.1109/MSST.2014.6855552

Torsten Hoefler, Marc Snir, Generic topology mapping strategies for large-scale parallel architectures. Proceedings of the International Conference on Supercomputing (ICS '11), pp. 75-84 (2011). DOI: 10.1145/1995896.1995909

C. S. Chang, S. Ku, P. H. Diamond, Z. Lin, S. Parker, T. S. Hahm, N. Samatova, Compressed ion temperature gradient turbulence in diverted tokamak edge. Physics of Plasmas, vol. 16, 056108 (2009). DOI: 10.1063/1.3099329

W. Dorland, F. Jenko, M. Kotschenreuther, B. N. Rogers, Electron temperature gradient turbulence. Physical Review Letters, vol. 85, pp. 5579-5582 (2000). DOI: 10.1103/PHYSREVLETT.85.5579

Philip Carns, Robert Latham, Robert Ross, Kamil Iskra, Samuel Lang, Katherine Riley, 24/7 Characterization of petascale I/O workloads. International Conference on Cluster Computing, pp. 1-10 (2009). DOI: 10.1109/CLUSTR.2009.5289150

Fang Zheng, Hongfeng Yu, Can Hantas, Matthew Wolf, Greg Eisenhauer, Karsten Schwan, Hasan Abbasi, Scott Klasky, GoldRush: resource efficient in situ scientific data analytics using fine-grained interference aware execution. IEEE International Conference on High Performance Computing, Data, and Analytics, pp. 78- (2013). DOI: 10.1145/2503210.2503279

Similar articles (6)

A Vision for Managing Extreme-Scale Data Hoards

The Case for a Common Instrumentation Interface for HPC Codes

ADIOS 2: The Adaptable Input Output System. A framework for high-performance data management

Visualization as a Service for Scientific Data.

Scalable Data-Intensive Geocomputation: A Design for Real-Time Continental Flood Inundation Mapping.

Data Federation Challenges in Remote Near-Real-Time Fusion Experiment Data Processing.
