Forecasting residue-residue contact prediction accuracy.

Authors: P P Wozniak, B M Konopka, J Xu, G Vriend, M Kotulska

DOI: 10.1093/BIOINFORMATICS/BTX416

Keywords: Protein secondary structure; Solvent accessibility; Experimental research; Statistics; Percentage point; Residue (complex analysis); Regression analysis; Mathematics; Supplementary data

Abstract: Motivation: Apart from meta-predictors, most of today's methods for residue-residue contact prediction are based entirely on Direct Coupling Analysis (DCA) of correlated mutations in multiple sequence alignments (MSAs). These methods are on average approximately 40% correct for the 100 strongest predicted contacts in each protein. The end-user who works on a single protein of interest will not know if predictions are either much more or much less correct than 40%, which is especially a problem if contacts are predicted to steer experimental research on that protein. Results: We designed a regression model that forecasts the accuracy of residue-residue contact prediction for individual proteins with an average error of 7 percentage points. Contacts were predicted with two DCA methods (gplmDCA and PSICOV). The models were built on parameters that describe the MSA, the predicted secondary structure, the predicted solvent accessibility and the contact prediction scores for the target protein. Results show that our models can also be applied to meta-methods, which was tested on RaptorX. Availability and implementation: All data and scripts are available from http://comprec-lin.iiar.pwr.edu.pl/dcaQ/. Contact: malgorzata.kotulska@pwr.edu.pl. Supplementary information: Supplementary data are available at Bioinformatics online.
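
The quantity that the abstract's regression model forecasts, the fraction of the 100 strongest predicted contacts that are correct, can be illustrated with a minimal sketch. This is not the authors' code: the function name `top_k_accuracy`, the dictionary-of-scores input format, and the toy data are assumptions made only for illustration, and no sequence-separation filter is applied.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[int, int]  # (residue i, residue j); hypothetical representation


def top_k_accuracy(scores: Dict[Pair, float],
                   true_contacts: Set[Pair],
                   k: int = 100) -> float:
    """Fraction of the k highest-scoring predicted pairs that are true contacts."""
    # Rank residue pairs by descending predicted score and keep the k strongest.
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    if not ranked:
        return 0.0
    # Count how many of the retained pairs appear in the reference contact set.
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


# Toy data, invented for illustration only.
toy_scores = {(1, 5): 0.9, (2, 8): 0.8, (3, 9): 0.4, (4, 10): 0.1}
toy_contacts = {(1, 5), (3, 9)}

print(top_k_accuracy(toy_scores, toy_contacts, k=2))  # prints 0.5
```

With `k=100` and real DCA scores, this per-protein value is the kind of accuracy the abstract reports as averaging about 40%.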
