State-of-the-art structural variant calling: What went conceptually wrong and how to fix it?

Authors: Arne Kutzner, Markus Schmidt

DOI: 10.1101/2021.01.12.426317

Abstract: Structural variant (SV) calling belongs to the standard tools of modern bioinformatics for identifying and describing alterations in genomes. Initially, this work presents several complex genomic rearrangements that reveal conceptual ambiguities inherent to the SV representations of state-of-the-art SV callers. We contextualize these ambiguities theoretically as well as practically and propose a graph-based approach for resolving them. Our graph model unifies both genomic strands by using the concept of skew-symmetry; it supports graph genomes in general and pan genomes in specific. Instances of our model are inferred directly from seeds instead of the commonly used alignments, which conflict with the various types of SV reported here. For yeast genomes, we practically compute adjacency matrices of our graph model and demonstrate that they provide highly accurate descriptions of one genome in terms of another. An open-source prototype implementation of our approach is available under the MIT license at https://github.com/ITBE-Lab/MA.
