Authors: Tomasz Kosciolek, David T. Jones
DOI: 10.1371/JOURNAL.PONE.0092197
Keywords: Algorithm, Bioinformatics, Sequence alignment, Multiple sequence alignment, Protein structure prediction, Covariance, Estimation of covariance matrices, Globular protein, Protein domain, Protein structure, Biology
Abstract: The advent of high-accuracy residue-residue intra-protein contact prediction methods has enabled a significant boost in the quality of de novo structure predictions. Here, we investigate the potential benefits of combining a well-established fragment-based folding algorithm, FRAGFOLD, with PSICOV, a contact prediction method which uses sparse inverse covariance estimation to identify co-varying sites in multiple sequence alignments. Using a comprehensive set of 150 diverse globular target proteins, up to 266 amino acids in length, we are able to address the effectiveness and some limitations of such approaches to globular proteins in practice. Overall, we find that using fragment assembly with both statistical potentials and predicted contacts is significantly better than using either statistical potentials or contacts alone. Results show nearly 80% correct predictions (TM-score ≥0.5) within the analysed dataset and a mean TM-score of 0.54. Unsuccessful modelling cases emerged either from conformational sampling problems or from insufficient contact prediction accuracy. Nevertheless, a strong dependency of the quality of the final models on the fraction of satisfied predicted long-range contacts was observed. This not only highlights the importance of these contacts in determining the protein fold, but also (combined with other ensemble-derived qualities) provides a powerful guide as to the choice of correct models and the global quality of the selected model. A proposed quality assessment scoring function achieves 0.93 precision and 0.77 recall for the discrimination of correct folds on our dataset of decoys. These findings suggest the approach is well-suited for blind predictions on a variety of globular proteins of unknown 3D structure, provided that enough homologous sequences are available to construct a large and accurate multiple sequence alignment for the initial contact prediction step.
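The fraction of satisfied predicted long-range contacts mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `fraction_satisfied`, the 8 Å distance cutoff, and the sequence separation of at least 24 residues used to call a contact "long-range" are assumptions chosen for illustration, since the abstract does not state the definitions used.

```python
import math

def fraction_satisfied(coords, contacts, max_dist=8.0, min_sep=24):
    """Fraction of predicted long-range contacts satisfied in a model.

    coords   -- one (x, y, z) tuple per residue (e.g. C-beta atoms), in angstroms
    contacts -- predicted contacts as (i, j) pairs of 0-based residue indices
    max_dist -- assumed distance cutoff for a satisfied contact (hypothetical default)
    min_sep  -- assumed minimum sequence separation for "long-range" (hypothetical default)
    """
    # Keep only contacts between residues far apart in sequence.
    long_range = [(i, j) for i, j in contacts if abs(i - j) >= min_sep]
    if not long_range:
        return 0.0
    # A contact is satisfied if the two residues are within the cutoff in the model.
    satisfied = sum(
        1 for i, j in long_range
        if math.dist(coords[i], coords[j]) <= max_dist
    )
    return satisfied / len(long_range)

# Toy model: 30 residues on a straight line, with the last one moved next to the first.
coords = [(3.8 * k, 0.0, 0.0) for k in range(30)]
coords[29] = (0.0, 5.0, 0.0)
predicted = [(0, 29), (1, 28), (2, 5)]  # (2, 5) is short-range and is ignored
print(fraction_satisfied(coords, predicted))  # 0.5
```

In the toy model, one of the two long-range contacts is satisfied, so the fraction is 0.5. A model-selection score in the spirit of the abstract would combine such a fraction with other ensemble-derived quantities, which are not specified here.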