Authors: A. A. Mitchell, M. E. Zwick, A. Chakravarti, D. J. Cutler
DOI: 10.1093/BIOINFORMATICS/BTH034
Keywords:
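Abstract: Summary: Three recent publications have examined the quality and completeness of the public database of single nucleotide polymorphisms (dbSNP) and have come to dramatically different conclusions regarding dbSNP's false positive rate and the proportion of dbSNPs that are expected to be common. These studies employed different genotyping technologies and different protocols in determining minimum acceptable genotyping quality thresholds. Because heterozygous sites typically have lower quality scores than homozygous sites, a higher quality threshold reduces the number of false positive SNPs, but yields fewer heterozygotes and leads to fewer confirmed SNPs. To account for the different confirmation rates and distributions of minor allele frequencies, we propose that the three confirmation studies have different false positive and false negative rates. We developed a mathematical model to predict SNP confirmation rates and the apparent distribution of minor allele frequencies under user-specified false positive and false negative rates. We applied this model to the three published studies and to our own resequencing effort. We conclude that the dbSNP false positive rate is ~15-17% and that the reported confirmation studies have vastly different genotyping error rates and patterns.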
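
The abstract does not spell out the model itself, so the following is only a toy illustration of the trade-off it describes, not the authors' method. It assumes Hardy-Weinberg genotype proportions, independent individuals, heterozygous calls that are lost with a hypothetical probability `het_dropout` (standing in for a stricter quality threshold), homozygous calls that are never lost, and false database entries that are never confirmed. All function and parameter names are made up for this sketch.

```python
def confirm_probability(maf, n_individuals, het_dropout):
    """Probability that a true SNP is observed at least once in a sample.

    Toy assumptions: Hardy-Weinberg proportions, independent individuals,
    heterozygous calls dropped with probability het_dropout, and
    homozygous calls never dropped.
    """
    het = 2.0 * maf * (1.0 - maf)   # heterozygote frequency
    hom_minor = maf * maf           # minor-allele homozygote frequency
    # Chance that one individual visibly carries the minor allele.
    seen_per_individual = het * (1.0 - het_dropout) + hom_minor
    return 1.0 - (1.0 - seen_per_individual) ** n_individuals


def apparent_confirmation_rate(mafs, n_individuals, het_dropout, db_false_positive):
    """Expected fraction of database entries that a study would confirm.

    Toy assumption: a false database entry is never confirmed, so only
    the true fraction (1 - db_false_positive) can contribute.
    """
    true_rate = sum(
        confirm_probability(m, n_individuals, het_dropout) for m in mafs
    ) / len(mafs)
    return (1.0 - db_false_positive) * true_rate
```

Under these assumptions, raising `het_dropout` lowers the apparent confirmation rate even when the database false positive rate is held fixed, which is the kind of discrepancy between studies that the abstract attributes to differing error rates and patterns.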