Authors: David H. Spencer, Manoj Tyagi, Francesco Vallania, Andrew J. Bredemeyer, John D. Pfeifer
DOI: 10.1016/J.JMOLDX.2013.09.003
Keywords: Genome, Coding region, Allele frequency, Polymorphism (computer science), Genotype, Biology, Sequence analysis, Genetics, Allele, Nucleotide
Abstract: Next-generation sequencing (NGS) is becoming a common approach for clinical testing of oncology specimens for mutations in cancer genes. Unlike inherited variants, cancer mutations may occur at low frequencies because of contamination from normal cells or tumor heterogeneity and can therefore be challenging to detect using common NGS analysis tools, which are often designed for constitutional genomic studies. We generated high-coverage (>1000×) NGS data from synthetic DNA mixtures with variant allele fractions (VAFs) of 25% to 2.5% to assess the performance of four variant callers, SAMtools, Genome Analysis Toolkit, VarScan2, and SPLINTER, in detecting low-frequency variants. SAMtools had the lowest sensitivity and detected only 49% of variants with VAFs of approximately 25%, whereas the Genome Analysis Toolkit, VarScan2, and SPLINTER detected at least 94% of variants with VAFs of approximately 10%. VarScan2 and SPLINTER achieved sensitivities of 97% and 89%, respectively, for variants with observed VAFs of 1% to 8%, with >98% sensitivity and >99% positive predictive value in coding regions. Coverage analysis demonstrated that >500× coverage was required for optimal performance. The specificity of SPLINTER improved with higher coverage, whereas VarScan2 yielded more false-positive results at high coverage levels, although this effect was abrogated by removing low-quality reads before variant identification. Finally, we demonstrate the utility of high-sensitivity variant callers with data from 15 clinical lung cancers.
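
The metrics named in the abstract (variant allele fraction, sensitivity, positive predictive value) and the dependence of detection on coverage can be illustrated with a minimal Python sketch. This is not the authors' pipeline or any caller's algorithm. The function names are hypothetical, the 10-read threshold is an arbitrary illustration, and the binomial model assumes error-free, independently sampled reads.

```python
from math import comb


def vaf(alt_reads: int, depth: int) -> float:
    """Variant allele fraction: variant-supporting reads / total reads at a site."""
    return alt_reads / depth


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true variants that were called."""
    return tp / (tp + fn)


def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: fraction of calls that are true variants."""
    return tp / (tp + fp)


def prob_at_least(k: int, depth: int, fraction: float) -> float:
    """P(X >= k) for X ~ Binomial(depth, fraction).

    Assumption: reads are sampled independently and without sequencing
    error, which real data do not satisfy.
    """
    p_fewer = sum(
        comb(depth, i) * fraction**i * (1 - fraction) ** (depth - i)
        for i in range(k)
    )
    return 1 - p_fewer


if __name__ == "__main__":
    # Chance of seeing at least 10 variant reads for a 2.5% VAF variant
    # at two illustrative depths (threshold of 10 is hypothetical).
    for depth in (100, 1000):
        print(depth, prob_at_least(10, depth, 0.025))
```

Under these simplifying assumptions, a variant at 2.5% VAF is far more likely to reach a fixed read-count threshold at 1000× than at 100×, which is consistent in spirit with the abstract's finding that high coverage is needed for low-frequency variant detection.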