Comparison of insertion/deletion calling algorithms on human next-generation sequencing data

Authors: Dalia H Ghoneim, Jason R Myers, Emily Tuttle, Alex R Paciorkowski

DOI: 10.1186/1756-0500-7-864


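Abstract: Insertions/deletions (indels) are the second most common type of genomic variant and the most common type of structural variant. Identification of indels in next generation sequencing data is a challenge, and algorithms commonly used for indel detection have not been compared on research cohort human subject data. Guidelines for the optimal detection of biologically significant indels are limited. We analyzed three sets of human next generation sequencing data (48 samples of 200 gene target exon sequencing, 45 samples of whole exome sequencing, and 2 samples of whole genome sequencing) using three algorithms for indel detection (Pindel, and the Genome Analysis Tool Kit's UnifiedGenotyper and HaplotypeCaller). We observed variation in indel calls across the three algorithms. The intersection of the three tools comprised only 5.70% of targeted exon, 19.52% of whole exome, and 14.25% of whole genome indel calls. The majority of the discordant indels were of lower read depth and likely to be false positives. When software parameters were kept consistent across the three targets, HaplotypeCaller produced the most reliable results. Pindel results did not validate well without adjustments to parameters to account for varied read depth and number of samples per run. Adjustments to Pindel's M (minimum support for each event) parameter improved both concordance and validation rates. Pindel was able to identify large deletions that surpassed the length capabilities of the GATK algorithms. Despite the observed variability in indel identification, we discerned strengths among the individual algorithms on specific data sets. This allowed us to suggest best practices for indel calling. Pindel's low validation rate for indel calls made in targeted exon sequencing suggests that HaplotypeCaller is better suited for short indels and multi-sample runs in targets with very high read depth. Pindel allows for optimization of the minimum support for events and is best used for detection of larger indels at lower read depths.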
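The intersection percentages quoted in the abstract can be illustrated with a minimal sketch. It assumes each indel call is keyed by chromosome, position, reference allele and alternate allele, and that concordance is the share of all distinct calls (the union) reported by every caller; the abstract does not state the exact denominator the authors used, so that choice is an assumption. The `Indel` type, the `concordance` function and the toy calls below are hypothetical and are not taken from the paper.

```python
from typing import Dict, NamedTuple, Set


class Indel(NamedTuple):
    """Hypothetical key for one indel call."""
    chrom: str
    pos: int
    ref: str
    alt: str


def concordance(call_sets: Dict[str, Set[Indel]]) -> float:
    """Fraction of all distinct calls (union) reported by every caller.

    Assumption: the denominator is the union of calls across callers.
    """
    sets = list(call_sets.values())
    if not sets:
        return 0.0
    union = set().union(*sets)
    if not union:
        return 0.0
    shared = set.intersection(*sets)
    return len(shared) / len(union)


# Toy data, invented for illustration only.
calls = {
    "pindel": {
        Indel("chr1", 100, "AT", "A"),
        Indel("chr1", 250, "G", "GCC"),
        Indel("chr2", 40, "CTTT", "C"),
    },
    "unified_genotyper": {
        Indel("chr1", 100, "AT", "A"),
        Indel("chr3", 77, "T", "TA"),
    },
    "haplotype_caller": {
        Indel("chr1", 100, "AT", "A"),
        Indel("chr1", 250, "G", "GCC"),
    },
}

# One call is shared by all three callers out of four distinct calls.
print(f"{concordance(calls):.2%}")
```

Exact matching on these four fields only works if every caller represents an indel the same way, so in practice calls would likely need to be normalized (for example left-aligned) before being compared.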
