Authors: Owen White, Ted Dunning, Granger Sutton, Mark Adams, J. Craig Venter
Keywords: Heterologous, Genome, DNA, Sequence analysis, Expressed sequence tag, Sequence alignment, Genetics, Genomic library, DNA sequencing, Biology
Abstract: Heterologous DNA sequences from rearrangements with the genomes of host cells, genomic fragments in hybrid cells, or impure tissue sources can threaten the purity of libraries that are derived from RNA or DNA. Hybridization methods can only detect contaminants from known or suspected heterologous sources, and whole library screening is technically very difficult. Detection of contaminating clones by sequence alignment is only possible when related sequences are present in a database. We have developed a statistical test to identify heterologous sequences based on differences in the hexamer composition of DNA from different organisms. This test does not require that sequences similar to potential contaminants be present in a database, and can in principle detect contamination by previously unknown organisms. We have applied this test to the major public expressed sequence tag (EST) data sets to evaluate its utility as a quality control measure and a peer evaluation tool. There is detectable heterogeneity in most human and C. elegans EST data sets, but it is not apparently associated with cross-species contamination. However, there is direct evidence for both yeast and bacterial contamination in some database sequences annotated as human. Results obtained with the test have been confirmed by similarity searches using the relevant data sets.
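
The abstract does not state the test statistic itself, so the following is only a minimal sketch of one way a hexamer-composition test could work, not the authors' method. It assumes a log-likelihood ratio between two hexamer frequency models with additive smoothing. The function names (`train_model`, `log_likelihood_ratio`), the pseudocount, and the idea of thresholding the score at zero are all illustrative assumptions.

```python
import math
from collections import Counter
from itertools import product

K = 6  # hexamer length, as named in the abstract
ALPHABET = "ACGT"
ALL_KMERS = ["".join(p) for p in product(ALPHABET, repeat=K)]  # 4**6 = 4096 words


def kmers(seq, k=K):
    """Yield overlapping k-mers of seq, skipping words with non-ACGT symbols."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set(ALPHABET):
            yield word


def train_model(sequences, pseudocount=1.0):
    """Return a dict of log-probabilities for every hexamer.

    Additive smoothing (an assumption of this sketch) keeps unseen
    hexamers from getting probability zero.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(kmers(seq))
    total = sum(counts.values()) + pseudocount * len(ALL_KMERS)
    return {w: math.log((counts[w] + pseudocount) / total) for w in ALL_KMERS}


def log_likelihood_ratio(seq, host_model, other_model):
    """Sum log(P_other / P_host) over the hexamers of seq.

    A positive score means the hexamer composition of seq looks more like
    the `other` training set than the `host` training set.
    """
    return sum(other_model[w] - host_model[w] for w in kmers(seq))


if __name__ == "__main__":
    # Toy training data (hypothetical, not real genomic sequence):
    # an AT-rich "host" and a GC-rich "contaminant".
    host_model = train_model(["ATATTTAAATATTTTAAATTATATAAAT" * 5])
    other_model = train_model(["GCGGCCGCGCCGGGCGCGCCGCGGCC" * 5])
    for query in ("ATATTTAAATAT", "GCGGCCGCGCCG"):
        score = log_likelihood_ratio(query, host_model, other_model)
        label = "possible contaminant" if score > 0 else "host-like"
        print(query, round(score, 2), label)
```

In practice the two models would be trained on large sequence collections from the organisms of interest, and a significance threshold would have to be calibrated instead of simply using the sign of the score.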