Author: Kristoffer Sahlin
DOI: 10.1101/2021.01.28.428549
Keywords: Selection (genetic algorithm), Indel, Algorithm, Mutation rate, Sensitivity (control systems), Sequence (medicine), Variable (computer science), Permutation, Computer science
Abstract: K-mer-based methods are widely used in bioinformatics for various types of sequence comparison. However, a single mutation will mutate k consecutive k-mers, which makes most k-mer-based applications for sequence comparison sensitive to variable mutation rates. Many techniques have been studied to overcome this sensitivity, e.g., spaced k-mers and k-mer permutation techniques, but these do not handle indels well. For indels, pairs or groups of small k-mers are commonly used, but these methods first produce k-mer matches, and only in a second step is the pairing or grouping of k-mers performed. Such methods produce many redundant k-mer matches due to the size of k. Here, we propose strobemers as an alternative to k-mers for sequence comparison. Intuitively, a strobemer consists of linked minimizers. We show that, under a certain minimizer selection technique, strobemers provide more evenly distributed sequence matches than k-mers and are less sensitive to different mutation rates and distributions. Strobemers also give higher total coverage of matches across sequences. Strobemers are useful for performing sequence comparisons for read alignment, clustering, classification, and error-correction.
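The two ideas in the abstract can be illustrated with a minimal sketch. The first part checks that one substitution alters k consecutive k-mers. The second part builds a simplified order-2 "linked minimizer" seed. The function names (`kmers`, `changed_kmers`, `linked_minimizers`), the window parameters `w_min` and `w_max`, and the use of lexicographic order to pick the minimizer are assumptions made for illustration, not the paper's exact construction.

```python
def kmers(seq, k):
    """Return (start, k-mer) for every k-mer in seq."""
    return [(i, seq[i:i + k]) for i in range(len(seq) - k + 1)]


def changed_kmers(a, b, k):
    """Count start positions where two equal-length sequences have different k-mers."""
    return sum(1 for (_, x), (_, y) in zip(kmers(a, k), kmers(b, k)) if x != y)


def linked_minimizers(seq, k, w_min, w_max):
    """Illustrative order-2 seed (hypothetical helper, not the paper's definition).

    For each start i, link the k-mer at i with the smallest k-mer
    (lexicographic order, an assumption) whose start lies in
    [i + w_min, i + w_max].
    """
    seeds = []
    last_start = len(seq) - k
    for i in range(last_start + 1):
        lo, hi = i + w_min, min(i + w_max, last_start)
        if lo > hi:
            break  # no room left for a second k-mer in the window
        j = min(range(lo, hi + 1), key=lambda p: seq[p:p + k])
        seeds.append((i, j, seq[i:i + k] + seq[j:j + k]))
    return seeds


ref = "ACGTTGCAAGCTTAGCCATGGATCCGTAAC"
mut = ref[:15] + "T" + ref[16:]          # single substitution C -> T at position 15
print(changed_kmers(ref, mut, 5))        # 5: the substitution hits k = 5 consecutive k-mers
print(linked_minimizers(ref, 5, 5, 10)[:3])
```

Because the second k-mer is chosen inside a window rather than at a fixed offset, the gap between the two linked k-mers can vary, which is the property that lets such seeds tolerate indels between the linked parts.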