Prediction of novel pre-microRNAs with high accuracy through boosting and SVM

Authors: Yuanwei Zhang, Yifan Yang, Huan Zhang, Xiaohua Jiang, Bo Xu

DOI: 10.1093/BIOINFORMATICS/BTR148

Keywords: Boosting (machine learning), Support vector machine, Data mining, Expressed sequence tag, Machine learning, Feature selection, Computer science, Artificial intelligence

Abstract: Summary: High-throughput deep-sequencing technology has generated an unprecedented number of expressed short sequence reads, presenting not only an opportunity but also a challenge for the prediction of novel microRNAs. To verify the existence of candidate microRNAs, we have to show that these short sequences can be processed from candidate pre-microRNAs. However, it is laborious and time consuming to verify these using existing experimental techniques. Therefore, here, we describe a new method, miRD, which is constructed using two feature selection strategies based on support vector machines (SVMs) and a boosting method. It is a high-efficiency tool for novel pre-microRNA prediction with accuracy up to 94.0% among different species. Availability: miRD is implemented in PHP/PERL+MySQL+R and can be freely accessed at http://mcg.ustc.edu.cn/rpg/mird/mird.php. Contact: qshi@ustc.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
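
The abstract names SVMs and boosting as the building blocks of miRD but gives no implementation details in this record. As a rough illustration of how the two ideas can combine, the sketch below runs AdaBoost-style reweighting over weighted linear SVMs trained by subgradient descent. Everything in it is an assumption made for illustration: the function names (`train_linear_svm`, `adaboost_svm`, `ensemble_predict`), the hyperparameters, and the two standardized toy features are hypothetical, and it does not reproduce miRD's feature selection strategies, kernel choice, or training data.

```python
import math

def train_linear_svm(X, y, sample_w, epochs=200, lr=0.1, lam=0.01):
    """Fit a weighted linear SVM by full-batch subgradient descent on hinge loss."""
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]      # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi, si in zip(X, y, sample_w):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:               # sample violates the margin
                for j in range(dim):
                    grad_w[j] -= si * yi * xi[j]
                grad_b -= si * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_predict(model, x):
    """Return +1 or -1 from the sign of the linear decision function."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def adaboost_svm(X, y, rounds=3):
    """AdaBoost-style loop: reweight samples, refit an SVM, store (alpha, model)."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        model = train_linear_svm(X, y, weights)
        preds = [svm_predict(model, xi) for xi in X]
        err = sum(wi for wi, p, yi in zip(weights, preds, y) if p != yi)
        if err >= 0.5:                       # no better than chance: stop
            break
        err = max(err, 1e-10)                # avoid log(0) on a perfect fit
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, model))
        # Misclassified samples gain weight, correct ones lose weight.
        weights = [wi * math.exp(-alpha * yi * p)
                   for wi, p, yi in zip(weights, preds, y)]
        total = sum(weights)
        weights = [wi / total for wi in weights]
    return ensemble

def ensemble_predict(ensemble, x):
    """Alpha-weighted vote of all stored SVMs."""
    vote = sum(alpha * svm_predict(model, x) for alpha, model in ensemble)
    return 1 if vote >= 0 else -1

# Hypothetical standardized features for six candidate hairpins;
# +1 = pre-microRNA, -1 = background. Values are made up for illustration.
X = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
     [-1.0, -1.0], [-2.0, -1.0], [-1.0, -2.0]]
y = [1, 1, 1, -1, -1, -1]

ensemble = adaboost_svm(X, y)
print([ensemble_predict(ensemble, x) for x in X])
```

On this linearly separable toy set the first SVM already fits every sample, so the reweighting step has little to do; on real, noisier data the later rounds would concentrate on the candidates that earlier learners misclassified.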
