Annotation of the Giardia proteome through structure-based homology and machine learning.

Authors: Brendan R E Ansell, Bernard J Pope, Peter Georgeson, Samantha J Emery-Corbin, Aaron R Jex

DOI: 10.1093/GIGASCIENCE/GIY150

Abstract:
Background: Large-scale computational prediction of protein structures represents a cost-effective alternative to empirical structure determination, with particular promise for non-model organisms and neglected pathogens. Conventional sequence-based tools are insufficient to annotate the genomes of such divergent biological systems. Conversely, protein structure tolerates substantial variation in primary amino acid sequence and is thus a robust indicator of biochemical function. Structural proteomics is poised to become a standard part of pathogen genomics research; however, informatic methods are now required to assign confidence in large volumes of predicted structures.
Aims: Our aim was to predict the proteome of a human pathogen, Giardia duodenalis, and stratify the predicted structures into high- and lower-confidence categories using a variety of metrics in isolation and in combination.
Methods: We used the I-TASSER suite to predict structural models for ∼5,000 proteins encoded in G. duodenalis and to identify their closest empirically-determined structural homologues in the Protein Data Bank. Models were assigned to high- or lower-confidence categories depending on the presence of matching protein family (Pfam) domains in query and reference peptides. Metrics output from the suite and derived metrics were assessed for their ability to predict the high-confidence category individually, and in combination through development of a random forest classifier.
Results: We identified 1,095 high-confidence models, including 212 hypothetical proteins. Amino acid identity between query and reference peptides was the greatest individual predictor of high-confidence status; however, the random forest classifier outperformed any metric in isolation (area under the receiver operating characteristic curve = 0.976) and identified a subset of 305 high-confidence-like models, corresponding to false-positive predictions. High-confidence models exhibited greater transcriptional abundance, and the classifier generalized across species, indicating the broad utility of this approach for automatically stratifying predicted structures. Additional structure-based clustering was used to cross-check confidence predictions in an expanded family of Nek kinases. Several high-confidence-like models yielded new insight into mechanisms of redox balance in G. duodenalis, a system central to the efficacy of limited anti-giardial drugs.
Conclusion: Structural proteomics combined with machine learning can aid genome annotation for genetically divergent organisms, including pathogens, and promote efficient allocation of resources for experimental investigation.
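
The labelling and single-metric evaluation described in the Methods can be illustrated with a minimal sketch. It assumes a model counts as high-confidence when the query and reference peptides share at least one Pfam domain, and it scores one metric (amino acid identity) by the area under the ROC curve, computed as the probability that a random high-confidence model outscores a random lower-confidence one. The function names and the toy records are hypothetical and are not taken from the paper; the random forest classifier itself is not reproduced here.

```python
from typing import Iterable, Sequence


def label_high_confidence(query_pfam: Iterable[str], reference_pfam: Iterable[str]) -> bool:
    """Assumed rule: high-confidence if query and reference share any Pfam domain."""
    return bool(set(query_pfam) & set(reference_pfam))


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """AUC as P(random positive scores above random negative); ties count as 0.5."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy records (not data from the paper):
# (query Pfam domains, reference Pfam domains, % amino acid identity)
toy_models = [
    (["PF00069"], ["PF00069"], 42.0),
    (["PF00069", "PF00023"], ["PF00023"], 31.0),
    ([], ["PF00072"], 12.0),
    (["PF00400"], ["PF00005"], 35.0),
    (["PF00005"], ["PF00005"], 28.0),
    ([], [], 9.0),
]

labels = [label_high_confidence(q, r) for q, r, _ in toy_models]
identity = [ident for _, _, ident in toy_models]
auc_identity = roc_auc(identity, labels)
print(f"AUC for amino acid identity on toy data: {auc_identity:.3f}")
```

The same `roc_auc` helper could be applied to any other per-model metric to compare individual predictors; combining metrics, as the abstract describes, would require a trained classifier such as a random forest.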
