Deep learning predicts non-coding RNA functions from only raw sequence data

Authors: Luigi Cerulo, Michele Ceccarelli, Teresa M.R. Noviello

DOI: 10.1101/2020.05.27.118778

Abstract: Non-coding RNAs (ncRNAs) are small non-coding sequences involved in gene regulation in many biological processes and diseases. The lack of a complete comprehension of their functionality, especially in a genome-wide scenario, has demanded new computational approaches to annotate their roles. It is widely known that secondary structure is a determinant of RNA function, and machine learning based approaches have been successfully proven to predict RNA function from secondary structure information. Here we show that RNA function can be predicted with good accuracy from raw sequence information, without the necessity of computing secondary structure features, which is computationally expensive. This finding appears to go against the dogma of secondary structure being a key determinant of RNA function. Compared to recent secondary structure based methods, the proposed solution is more robust to sequence boundary noise and drastically reduces the computational cost, allowing for large data volume annotations. Scripts and datasets to reproduce the results of the experiments in this study are available at: https://github.com/bioinformatics-sannio/ncrna-deep
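
To make "raw sequence information" concrete, the following minimal sketch shows one common way a nucleotide sequence can be turned into a fixed-size numeric input for a neural network: one-hot encoding with zero padding. It is an illustration only and is not taken from the paper or its repository; the function name `one_hot_encode`, the alphabet order `ACGU`, and the padding and truncation rules are all assumptions.

```python
# Illustrative only: one-hot encode a raw RNA sequence for a neural network.
# The alphabet order, padding scheme, and function name are assumptions,
# not taken from the paper or its repository.

ALPHABET = "ACGU"


def one_hot_encode(sequence, length):
    """Return a `length` x 4 matrix (list of lists) for `sequence`.

    DNA-style 'T' is treated as 'U'. Unknown symbols (e.g. 'N') and
    padding positions are encoded as all-zero rows. Sequences longer
    than `length` are truncated.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    seq = sequence.upper().replace("T", "U")[:length]
    matrix = []
    for base in seq:
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        matrix.append(row)
    # Pad with zero rows so every sequence has the same shape.
    matrix.extend([0] * len(ALPHABET) for _ in range(length - len(seq)))
    return matrix


if __name__ == "__main__":
    for row in one_hot_encode("GAUNc", 6):
        print(row)
```

No secondary structure is computed at any point in this encoding, which is the source of the cost saving the abstract describes. The actual input representation used in the study may differ and should be checked against the linked repository.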
