Authors: Ivo Hedtke, Ioana Lemnian, Matthias Müller-Hannemann, Ivo Grosse
DOI: 10.1007/978-3-319-07953-0_7
Keywords: Heuristics, Block (data storage), Trimming, Constrained optimization problem, Algorithm, Almost surely, Computer science, Quality (business)
Abstract: Read trimming is a fundamental first step of the analysis of next generation sequencing (NGS) data. Traditionally, read trimming is performed heuristically, and algorithmic work in this area has been neglected. Here, we address this topic and formulate three constrained optimization problems for block-based trimming, i.e., truncating the same low-quality positions at both ends of all reads and removing low-quality truncated reads. We find that the three problems are \(\mathcal{NP}\)-hard. However, the non-random distribution of quality scores in NGS data sets makes it tempting to speculate that some of the constraints are typically satisfied once the others are fulfilled. Based on this speculation, we propose three relaxed problems and develop efficient polynomial-time algorithms for them. We find that (i) the omitted constraints are indeed almost always satisfied and (ii) the algorithms yield a higher number of untrimmed bases than traditional heuristics.
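To make the notion of block-based trimming concrete, here is a minimal Python sketch. It is not one of the three problems or algorithms from the paper, whose exact constraints the abstract does not state. It assumes that all reads have the same length, that each read is given as a list of Phred quality scores, and that a truncated read counts as low-quality when its mean score falls below a threshold. The names `trim_block`, `best_block`, and `min_mean_q` are hypothetical, and the search is a plain brute force over all windows rather than an efficient algorithm.

```python
def trim_block(quals, start, end, min_mean_q):
    """Truncate every read to the window [start, end) and drop reads
    whose mean quality inside that window is below min_mean_q.

    quals is a list of equal-length lists of Phred scores (assumption).
    """
    kept = []
    for q in quals:
        window = q[start:end]
        if window and sum(window) / len(window) >= min_mean_q:
            kept.append(window)
    return kept


def best_block(quals, min_mean_q):
    """Try every window and return (retained_bases, start, end) for the
    window that keeps the most bases. Ties go to the first window found.
    """
    n = len(quals[0])
    best = (0, 0, 0)
    for start in range(n):
        for end in range(start + 1, n + 1):
            kept = trim_block(quals, start, end, min_mean_q)
            bases = len(kept) * (end - start)
            if bases > best[0]:
                best = (bases, start, end)
    return best


if __name__ == "__main__":
    # Toy data: two reads with poor ends, one read that is poor throughout.
    quals = [
        [10, 30, 30, 30, 10],
        [12, 32, 34, 30, 8],
        [5, 6, 7, 6, 5],
    ]
    print(best_block(quals, min_mean_q=28))
```

On this toy input the search cuts one position from each end and discards the third read, which illustrates the trade-off between window length and the number of reads that survive the quality filter.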