Authors: Junho Kim, Ju Heon Maeng, Jae Seok Lim, Hyeonju Son, Junehawk Lee
DOI: 10.1093/BIOINFORMATICS/BTW383
Keywords:
Abstract: Motivation: Advances in sequencing technologies have remarkably lowered the detection limit of somatic variants to a low frequency. However, calling mutations at this range is still confounded by many factors, including environmental contamination. Vector contamination is a continuously occurring issue and is especially problematic since vector inserts are hardly distinguishable from the sample sequences. Such inserts, which may harbor polymorphisms and engineered functional mutations, can result in calling false variants at the corresponding sites. Numerous vector-screening methods have been developed, but none could handle contamination from inserts because they focus on vector backbone sequences alone. Results: We developed a novel method, Vecuum, that identifies vector-originated reads and the resultant false variants. Since vector inserts are generally constructed from intron-less cDNAs, Vecuum identifies vector-originated reads by inspecting the clipping patterns at exon junctions. False variant calls are further detected based on the biased distribution of mutant alleles to vector-originated reads. Tests on simulated and spike-in experimental data validated that Vecuum could detect 93% of vector contaminants and could remove up to 87% of variant-like false calls with 100% precision. Application to public sequence datasets demonstrated the utility of Vecuum in detecting false variants resulting from various types of external contamination. Availability and implementation: A Java-based implementation of the method is available at http://vecuum.sourceforge.net/ Contact: swkim@yuhs.ac Supplementary information: Supplementary data are available at Bioinformatics online.
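The clipping-pattern idea in the abstract can be illustrated with a minimal sketch. This is not Vecuum's implementation (which is Java-based and not described in detail here). It only assumes that a read from an intron-less cDNA insert, when aligned to the genome, tends to be soft-clipped exactly at an exon boundary. The function names, the use of SAM-style `POS` and CIGAR fields, and the exon coordinates below are all hypothetical and chosen for illustration.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def clip_breakpoints(pos, cigar):
    """Return (side, reference_position) for each soft-clipped read end.

    pos is the 1-based leftmost aligned position (SAM POS field).
    Hard clips are ignored; only soft clips ('S') are reported.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
    # Number of reference bases covered by the aligned part of the read.
    ref_len = sum(n for n, op in ops if op in "MDN=X")
    if ref_len == 0:
        return []
    points = []
    if ops[0][1] == "S":
        points.append(("left", pos))                 # first aligned base
    if ops[-1][1] == "S":
        points.append(("right", pos + ref_len - 1))  # last aligned base
    return points

def is_junction_clipped(pos, cigar, exon_starts, exon_ends):
    """True if a soft clip begins exactly at an annotated exon boundary."""
    for side, bp in clip_breakpoints(pos, cigar):
        if side == "left" and bp in exon_starts:
            return True
        if side == "right" and bp in exon_ends:
            return True
    return False

def alt_fraction_in_flagged(reads):
    """Fraction of mutant-allele reads that were flagged as junction-clipped.

    reads is an iterable of (is_flagged, carries_alt) booleans.
    Returns None when no read carries the mutant allele.
    """
    alt_flags = [flagged for flagged, carries_alt in reads if carries_alt]
    if not alt_flags:
        return None
    return sum(alt_flags) / len(alt_flags)

# Hypothetical exon spanning reference positions 1000..1100 (1-based, inclusive).
exon_starts, exon_ends = {1000}, {1100}
suspicious = is_junction_clipped(1000, "30S70M", exon_starts, exon_ends)
ordinary = is_junction_clipped(1020, "100M", exon_starts, exon_ends)
```

A value of `alt_fraction_in_flagged` close to 1 would mean the mutant allele is carried almost exclusively by junction-clipped reads, which is the kind of bias the abstract describes. Any threshold applied to that value would be a further assumption.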