Scientific workflows for computational reproducibility in the life sciences: Status, challenges and opportunities

Authors: Sarah Cohen-Boulakia, Khalid Belhajjame, Olivier Collin, Jérôme Chopard, Christine Froidevaux

DOI: 10.1016/J.FUTURE.2017.01.012

Keywords: Computer science, Domain (software engineering), Context (language use), Set (psychology), Data science, Reproducibility, Workflow, Use case

Abstract: With the development of new experimental technologies, biologists are faced with an avalanche of data to be computationally analyzed for scientific advancements and discoveries to emerge. Faced with the complexity of analysis pipelines, the large number of computational tools, and the enormous amount of data to manage, there is compelling evidence that many if not most scientific discoveries will not stand the test of time: increasing the reproducibility of computed results is of paramount importance. The objective we set out in this paper is to place scientific workflows in the context of reproducibility. To do so, we define several kinds of reproducibility that can be reached when scientific workflows are used to perform experiments. We characterize and define the criteria that need to be catered for by reproducibility-friendly scientific workflow systems, and use such criteria to place several representative and widely used workflow systems and companion tools within such a framework. We also discuss the remaining challenges posed by reproducible scientific workflows in the life sciences. Our study was guided by three use cases from the life science domain involving in silico experiments.
