Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

Authors: Rolf S Kaas, Carsten Friis, David W Ussery, Frank M Aarestrup

DOI: 10.1186/1471-2164-13-577

Abstract: Escherichia coli exists in commensal and pathogenic forms. By measuring the variation of individual genes across more than a hundred sequenced genomes, gene variation can be studied in detail, including the number of mutations found for any given gene. This knowledge will be useful for creating better phylogenies, for the determination of molecular clocks, and for improved typing techniques. We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps more biologically relevant, especially considering that many of these genome sequences are of draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment, and a pan-genome tree, based on gene presence/absence, map the relatedness of the 186 sequenced E. coli genomes. The core-gene tree displays high confidence and divides the E. coli strains into the observed MLST type clades, and it also separates the defined phylotypes. The results of comparing a large and diverse E. coli dataset support the theory that reliable, good-resolution phylogenies can be inferred from the core-genome. The results further suggest that the resolution at the isolate level may subsequently be improved by targeting more variable genes. The use of whole genome sequencing will make it possible to eliminate, or at least reduce, the need for several typing steps used in traditional epidemiology.
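The distinction the abstract draws between the strict core (clusters in 100% of genomes), the 'soft core' (clusters in at least 95% of genomes), and the pan-genome (all clusters) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `classify_clusters`, the dictionary layout, and the toy data are assumptions made for illustration, and gene clustering itself is taken as already done.

```python
from typing import Dict, List, Set


def classify_clusters(
    presence: Dict[str, Set[str]],
    n_genomes: int,
    soft_threshold: float = 0.95,
) -> Dict[str, List[str]]:
    """Split gene clusters into pan-genome, soft core, and strict core.

    presence maps a (hypothetical) cluster id to the set of genome ids
    in which at least one member of that cluster was found.
    """
    soft_core, strict_core = [], []
    for cluster, genomes in presence.items():
        # Soft core: present in at least soft_threshold of all genomes.
        if len(genomes) / n_genomes >= soft_threshold:
            soft_core.append(cluster)
        # Strict core: present in every genome.
        if len(genomes) == n_genomes:
            strict_core.append(cluster)
    return {
        "pan": sorted(presence),          # every cluster seen at all
        "soft_core": sorted(soft_core),
        "strict_core": sorted(strict_core),
    }


# Toy data: 40 made-up genomes and 4 made-up clusters.
genome_ids = [f"g{i}" for i in range(40)]
toy_presence = {
    "clusterA": set(genome_ids),        # in all 40 genomes
    "clusterB": set(genome_ids[:39]),   # in 39 of 40 (97.5%)
    "clusterC": set(genome_ids[:30]),   # in 30 of 40 (75%)
    "clusterD": {"g0"},                 # in a single genome
}
result = classify_clusters(toy_presence, n_genomes=len(genome_ids))
print(result)
```

A relaxed threshold tolerates a gene that is missing from a few genomes only because they are incomplete, which is the reason the abstract gives for treating the soft core as perhaps more biologically relevant when many sequences are of draft quality.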
