Authors: Elizabeth M. Blue, Lei Sun, Nathan L. Tintle, Ellen M. Wijsman
DOI: 10.1002/GEPI.21821
Keywords:
Abstract: When analyzing family data, we dream of perfectly informative data, even whole-genome sequences (WGSs) for all family members. Reality intervenes, and we find that next-generation sequencing (NGS) data have errors and are often too expensive or impossible to collect on everyone. The Genetic Analysis Workshop 18 working groups on quality control and dropping WGSs through families using a genome-wide association framework focused on finding, correcting, and using errors within the available sequence and family data, developing methods to infer and analyze missing sequence data among relatives, and testing for linkage and association with simulated blood pressure. We found that single-nucleotide polymorphisms, NGS data, and imputed data are generally concordant but that errors are particularly likely at rare variants, for homozygous genotypes, within regions with repeated sequences or structural variants, and within sequence data imputed from unrelated individuals. Admixture complicated identification of cryptic relatedness, but information from Mendelian transmission improved error detection and provided an estimate of the de novo mutation rate. Computationally, fast rule-based imputation was accurate but could not cover as many loci or subjects as more computationally demanding probability-based methods. Incorporating population-level data into pedigree-based imputation methods improved results. Observed data outperformed imputed data in association testing, but imputed data were also useful. We discuss the strengths and weaknesses of existing methods and suggest possible future directions, such as improving communication between data collectors and data analysts, establishing thresholds for and improving imputation quality, and incorporating error into imputation and analytical models.