Managing Complex Workflows in Bioinformatics: An Interactive Toolkit With GPU Acceleration

Authors: Anuradha Welivita, Indika Perera, Dulani Meedeniya, Anuradha Wickramarachchi, Vijini Mallawaarachchi

DOI: 10.1109/TNB.2018.2837122

Abstract: Bioinformatics research continues to advance at an increasing scale with the help of techniques such as next-generation sequencing and the availability of tool support to automate bioinformatics processes. With this growth, a large amount of biological data is accumulating at an unprecedented rate, demanding high-performance and high-throughput computing technologies for processing such datasets. The use of hardware accelerators, such as graphics processing units (GPUs), and distributed computing accelerates the processing of big data in high-performance computing environments. They enable higher degrees of parallelism to be achieved, thereby increasing throughput. In this paper, we introduce BioWorkflow, an interactive workflow management system to automate bioinformatics analyses, with the capability of scheduling parallel tasks with the use of GPU-accelerated and distributed computing. This paper describes a case study carried out to evaluate the performance of a complex workflow with branching executed by BioWorkflow. The results indicate gains of $\times 2.89$ magnitude by utilizing GPUs and gains in speed of, on average, $\times 2.832$ (over $n = 5$ scenarios) by parallel execution of graph nodes during multiple sequence alignment calculations. Combined speed-ups of $\times 1.71$ are achieved for workflows. This confirms the expected higher speed-ups when having parallelism through GPU-acceleration and concurrent execution of workflow nodes than with mainstream sequential workflow execution. BioWorkflow also provides a comprehensive user interface with better interactivity for managing complex workflows; a usability score of 82.9 confirms the high usability of the system.
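The idea of executing independent graph nodes of a branching workflow in parallel can be illustrated with a minimal Python sketch. This is not BioWorkflow's scheduler, whose implementation the abstract does not describe. The function `run_workflow`, the node names, and the toy sequence data are all hypothetical, and GPU acceleration and distributed execution are not modeled. The sketch assumes Python 3.9+ for `graphlib`, and that every dependency named in `deps` is also a key of `tasks`.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_workflow(tasks, deps, max_workers=4):
    """Run a branching workflow, executing independent nodes concurrently.

    tasks: dict mapping node name -> callable taking a dict of upstream results
    deps:  dict mapping node name -> set of node names it depends on
    Returns a dict mapping node name -> result.
    """
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results = {}
    running = {}  # future -> node name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every node whose dependencies have all finished.
            for name in sorter.get_ready():
                upstream = {d: results[d] for d in deps.get(name, set())}
                running[pool.submit(tasks[name], upstream)] = name
            # Block until at least one running node completes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for fut in done:
                name = running.pop(fut)
                results[name] = fut.result()
                sorter.done(name)  # unblocks downstream nodes
    return results


# Hypothetical diamond-shaped workflow: one load step, two independent
# branches that may run concurrently, and a final merge step.
tasks = {
    "load": lambda up: ["ACGT", "AATT", "GGCC"],
    "gc_content": lambda up: [
        sum(c in "GC" for c in s) / len(s) for s in up["load"]
    ],
    "lengths": lambda up: [len(s) for s in up["load"]],
    "report": lambda up: list(zip(up["lengths"], up["gc_content"])),
}
deps = {
    "gc_content": {"load"},
    "lengths": {"load"},
    "report": {"gc_content", "lengths"},
}

results = run_workflow(tasks, deps)
print(results["report"])
```

Here `gc_content` and `lengths` depend only on `load`, so they become ready at the same time and are submitted to the pool together; `report` starts only after both finish. A thread pool is used for brevity; CPU-bound steps would more plausibly run as separate processes or external tools.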
