Distributed ICSA Clustering Approach for Large Scale Protein Sequences and Cancer Diagnosis

Authors: K Thenmozhi, S Shanthi, M Pyingkodi

DOI: 10.31557/APJCP.2018.19.11.3105

Abstract: Objective: With the oversaturating growth of biological sequence databases, handling these amounts of data has increasingly become a problem. Clustering is one of the principal research objectives in structural and functional genomics. However, exact clustering algorithms, such as partitioned and hierarchical clustering, scale relatively poorly in terms of run time and memory usage with large sets of sequences. Methods: To address these performance limits, heuristic optimizations, namely a Cuckoo Search Algorithm with genetic operators (ICSA), have been implemented in a distributed computing environment. The proposed ICSA is a globally optimized algorithm that can cluster large numbers of protein sequences by running on distributed hardware. Results: It allocates both memory and computing resources efficiently. Compared with the latest research results, our method requires only 15% of the execution time and obtains even higher-quality information about the protein sequences. Conclusion: From the experimental analysis, we noticed that using the ICSA technique instead of alignment methods greatly reduces execution time and improves the efficiency of this important task in molecular biology. Moreover, the new era of proteomics is providing us with extensive knowledge of mutations and other alterations in cancer studies.

References (16)
Shailza Singh, Balwant Kumar Malik, Durlabh Kumar Sharma, Molecular drug targets and structure based drug design: A holistic approach. Bioinformation, vol. 1, pp. 314-320 (2006), 10.6026/97320630001314
William H. E. Day, Herbert Edelsbrunner, Efficient algorithms for agglomerative hierarchical clustering methods. Journal of Classification, vol. 1, pp. 7-24 (1984), 10.1007/BF01890115
Yijun Sun, Yunpeng Cai, Li Liu, Fahong Yu, Michael L. Farrell, William McKendree, William Farmerie, ESPRIT: estimating species richness using large collections of 16S rRNA pyrosequences. Nucleic Acids Research, vol. 37 (2009), 10.1093/NAR/GKP285
Anton J Enright, Stijn Van Dongen, Christos A Ouzounis, An efficient algorithm for large-scale detection of protein families. Nucleic Acids Research, vol. 30, pp. 1575-1584 (2002), 10.1093/NAR/30.7.1575
Yuri I Wolf, Igor B Rogozin, Alexey S Kondrashov, Eugene V Koonin, Genome Alignment, Evolution of Prokaryotic Genome Organization, and Prediction of Gene Function Using Genomic Context. Genome Research, vol. 11, pp. 356-372 (2001), 10.1101/GR.GR-1619R
J. R. Cole, Q. Wang, E. Cardenas, J. Fish, B. Chai, R. J. Farris, A. S. Kulam-Syed-Mohideen, D. M. McGarrell, T. Marsh, G. M. Garrity, J. M. Tiedje, The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Research, vol. 37, pp. 141-145 (2009), 10.1093/NAR/GKN879
João F. Matias Rodrigues, Christian von Mering, HPC-CLUST: distributed hierarchical clustering for large sets of nucleotide sequences. Bioinformatics, vol. 30, pp. 287-288 (2014), 10.1093/BIOINFORMATICS/BTT657
Ehsan Amiri, Shadi Mahmoudi, Efficient protocol for data clustering by fuzzy Cuckoo Optimization Algorithm. Applied Soft Computing, vol. 41, pp. 15-21 (2016), 10.1016/J.ASOC.2015.12.008