Authors: Rolf S Kaas, Carsten Friis, David W Ussery, Frank M Aarestrup
Keywords:
Abstract: Escherichia coli exists in commensal and pathogenic forms. By measuring the variation of individual genes across more than a hundred sequenced genomes, gene variation can be studied in detail, including the number of mutations found for any given gene. This knowledge will be useful for creating better phylogenies, for the determination of molecular clocks, and for improved typing techniques. We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps more biologically relevant, especially considering that many of these genome sequences are of draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment, and a pan-genome tree, based on gene presence/absence, map the relatedness of the 186 sequenced E. coli genomes. The core-gene tree displays high confidence, divides the E. coli strains into the observed MLST type clades, and also separates the defined phylotypes. The results of comparing a large and diverse E. coli dataset support the theory that reliable, good-resolution phylogenies can be inferred from the core-genome. The results further suggest that the resolution at the isolate level may subsequently be improved by targeting more variable genes. The use of whole genome sequencing will make it possible to eliminate, or at least reduce, the need for several typing steps used in traditional epidemiology.
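
The distinction the abstract draws between the strict core (present in 100% of genomes), the 'soft core' (present in at least 95% of genomes), and the pan-genome (every cluster seen in any genome) can be illustrated with a minimal Python sketch. It assumes the gene clusters have already been computed and are available as a mapping from cluster ID to the set of genomes containing that cluster. The function name `classify_clusters`, the parameter `soft_percent`, and the toy data are hypothetical and are not taken from the paper's pipeline.

```python
from typing import Dict, Set, Tuple


def classify_clusters(
    presence: Dict[str, Set[str]],
    n_genomes: int,
    soft_percent: int = 95,  # assumed threshold, matching the 95% in the abstract
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split gene clusters into strict core, soft core, and pan-genome.

    presence maps a cluster ID to the set of genome IDs that contain it.
    """
    strict_core: Set[str] = set()
    soft_core: Set[str] = set()
    pan_genome: Set[str] = set()
    for cluster, genomes in presence.items():
        count = len(genomes)
        if count == 0:
            continue  # a cluster found in no genome is not part of the pan-genome
        pan_genome.add(cluster)
        # Integer arithmetic avoids floating-point issues at the threshold.
        if count * 100 >= soft_percent * n_genomes:
            soft_core.add(cluster)
        if count == n_genomes:
            strict_core.add(cluster)
    return strict_core, soft_core, pan_genome


# Toy data: 20 hypothetical genomes and four hypothetical clusters.
genome_ids = [f"g{i}" for i in range(20)]
toy_presence = {
    "clusterA": set(genome_ids),       # in all 20 genomes
    "clusterB": set(genome_ids[:19]),  # in 19 of 20 genomes (95%)
    "clusterC": set(genome_ids[:10]),  # in half of the genomes
    "clusterD": {"g0"},                # in a single genome
}

strict_core, soft_core, pan_genome = classify_clusters(toy_presence, len(genome_ids))
print(sorted(strict_core))
print(sorted(soft_core))
print(len(pan_genome))
```

In this toy data, a cluster missing from a single genome still counts toward the soft core. That is the property that makes the 95% threshold tolerant of draft-quality assemblies, where a gene may be absent only because of a sequencing or assembly gap.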