Authors: Dalia H Ghoneim, Jason R Myers, Emily Tuttle, Alex R Paciorkowski
Keywords:
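Abstract: Insertions/deletions (indels) are the second most common type of genomic variant and the most common type of structural variant. Identification of indels in next generation sequencing data is a challenge, and algorithms commonly used for indel detection have not been compared on research cohort human subject data. Guidelines for the optimal detection of biologically significant indels are limited. We analyzed three sets of human next generation sequencing data (48 samples of 200 gene target exon sequencing, 45 samples of whole exome sequencing, and 2 samples of whole genome sequencing) using three algorithms for indel detection (Pindel, and the Genome Analysis Tool Kit's UnifiedGenotyper and HaplotypeCaller). We observed variation in indel calls across the three algorithms. The intersection of the three tools comprised only 5.70% of targeted exon, 19.52% of whole exome, and 14.25% of whole genome indel calls. The majority of the discordant indels were of lower read depth and likely to be false positives. When software parameters were kept consistent across the three targets, HaplotypeCaller produced the most reliable results. Pindel results did not validate well without adjustments to parameters to account for varied read depth and number of samples per run. Adjustments to Pindel's M (minimum support for each event) parameter improved both concordance and validation rates. Pindel was able to identify large deletions that surpassed the length capabilities of the GATK algorithms. Despite the observed variability in indel identification, we discerned strengths among the individual algorithms on specific data sets. This allowed us to suggest best practices for indel calling. Pindel's low validation rate for indel calls made in targeted exon sequencing suggests that HaplotypeCaller is better suited for short indels and multi-sample runs in targets with very high read depth. Pindel allows for optimization of minimum support for events and is best used for detection of larger indels at lower read depths.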
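The three-way intersection reported in the abstract can be illustrated with a minimal sketch. It assumes each caller's output has already been reduced to a set of normalized (chromosome, position, ref, alt) tuples, and that the percentage is taken over the union of all calls. The abstract does not state the denominator or the matching rule, so both are assumptions here, and the function name and toy call sets are hypothetical.

```python
from typing import Iterable, Tuple

# An indel call keyed by (chromosome, position, ref allele, alt allele).
# Assumption: calls were left-aligned/normalized beforehand so that the
# same event has the same key regardless of which caller reported it.
Indel = Tuple[str, int, str, str]


def three_way_concordance(a: Iterable[Indel],
                          b: Iterable[Indel],
                          c: Iterable[Indel]) -> float:
    """Percentage of all distinct calls that were made by all three callers.

    Assumption: the denominator is the union of the three call sets.
    """
    sa, sb, sc = set(a), set(b), set(c)
    union = sa | sb | sc
    if not union:
        return 0.0
    shared = sa & sb & sc
    return 100.0 * len(shared) / len(union)


# Toy call sets (made-up data, for illustration only).
pindel = {("chr1", 100, "AT", "A"), ("chr1", 200, "G", "GTT"), ("chr2", 50, "CAA", "C")}
unified_genotyper = {("chr1", 100, "AT", "A"), ("chr2", 50, "CAA", "C"), ("chr3", 10, "T", "TA")}
haplotype_caller = {("chr1", 100, "AT", "A"), ("chr2", 75, "GC", "G")}

# One call is shared by all three callers out of five distinct calls.
print(three_way_concordance(pindel, unified_genotyper, haplotype_caller))  # 20.0
```

Exact-key matching is the strictest choice. Indel callers often report the same event at slightly shifted coordinates, so a real comparison would likely need normalization or a positional tolerance before intersecting.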