Authors: S. Gravel, B. M. Henn, R. N. Gutenkunst, A. R. Indap, G. T. Marth
Keywords:
Abstract: High-throughput sequencing technology enables population-level surveys of human genomic variation. Here, we examine the joint allele frequency distributions across continental human populations and present an approach for combining complementary aspects of whole-genome, low-coverage data and targeted high-coverage data. We apply this approach to data generated by the pilot phase of the Thousand Genomes Project, including whole-genome 2–4× coverage data for 179 samples from HapMap European, Asian, and African panels, as well as high-coverage target sequencing of the exons of 800 genes from 697 individuals in seven populations. We use the site frequency spectra obtained from these data to infer demographic parameters for an Out-of-Africa model for populations of African, European, and Asian descent and to predict, by a jackknife-based approach, the amount of genetic diversity that will be discovered as sample sizes are increased. We predict that the number of discovered nonsynonymous coding variants will reach 100,000 in each population after ∼1,000 sequenced chromosomes per population, whereas ∼2,500 chromosomes will be needed for the same number of synonymous variants. Beyond this point, the number of segregating sites in the European and Asian panel populations is expected to overcome that of the African panel because of faster recent population growth. Overall, we find that the majority of human genomic variable sites are rare and exhibit little sharing among diverged populations. Our results emphasize that replication of disease association for specific rare genetic variants across diverged populations must overcome both reduced statistical power because of rarity and higher population divergence.
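Illustrative note: the abstract relies on site frequency spectra and on the sharing of variants between populations. The sketch below is not the authors' pipeline. It is a minimal illustration, on made-up toy data, of how a single-population spectrum, a joint two-population spectrum, and the fraction of population-private variants could be tabulated. The function names and the 0/1 derived-allele encoding are assumptions chosen for this example.

```python
from collections import Counter


def site_frequency_spectrum(sites):
    """Unfolded SFS: sfs[k] = number of sites whose derived allele
    is carried by exactly k of the n sampled chromosomes.

    `sites` is a list of sites; each site is a list of 0/1 indicators
    (1 = derived allele), one entry per chromosome (assumed encoding).
    """
    if not sites:
        return []
    n = len(sites[0])
    sfs = [0] * (n + 1)
    for site in sites:
        if len(site) != n:
            raise ValueError("every site needs the same number of chromosomes")
        sfs[sum(site)] += 1
    return sfs


def joint_sfs(pop_a, pop_b):
    """Joint SFS as a Counter keyed by (derived count in A, derived count in B)."""
    if len(pop_a) != len(pop_b):
        raise ValueError("both populations must cover the same sites")
    return Counter((sum(a), sum(b)) for a, b in zip(pop_a, pop_b))


def private_fraction(jsfs):
    """Fraction of observed variants seen in only one of the two populations."""
    observed = sum(c for (a, b), c in jsfs.items() if a + b > 0)
    private = sum(c for (a, b), c in jsfs.items() if (a == 0) != (b == 0))
    return private / observed if observed else 0.0


# Toy data (hypothetical): 4 sites, 4 chromosomes sampled per population.
pop_a = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0]]
pop_b = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]

print(site_frequency_spectrum(pop_a))
print(private_fraction(joint_sfs(pop_a, pop_b)))
```

In this toy example three of the four observed variants appear in only one population, a small-scale picture of the low sharing of rare variants the abstract describes. Real analyses would also need to handle missing data, ancestral-state uncertainty, and the low-coverage calling errors that the paper's combined approach is designed to address.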