String matching with variable length gaps

作者: Philip Bille , Inge Li Gørtz , Hjalte Wedel Vildhøj , David Kofoed Wind

DOI: 10.1007/978-3-642-16321-0_40

关键词:

摘要: We consider string matching with variable length gaps. Given a T and pattern P consisting of strings separated by gaps (arbitrary in specified range), the problem is to find all ending positions substrings that match P. This basic primitive computational biology applications. Let m n be lengths T, respectively, let k number present new algorithm achieving time O((n+m) log k+α) space O(m+A), where A sum lower bounds α total occurrences within T. Compared previous results this bound essentially achieves best known complexities simultaneously. Consequently, our obtains for almost combinations m, n, k, A, α. Our surprisingly simple straightforward implement.

参考文章(21)
Kostas Tsichlas, Wojciech Rytter, Athanasios Tsakalidis, Maxime Crochemore, Costas Iliopoulos, Christos Makris, Approximate string matching with gaps Nordic Journal of Computing. ,vol. 9, pp. 54- 65 ,(2002)
M. Sohel Rahman, Costas S. Iliopoulos, Inbok Lee, Manal Mohamed, William F. Smyth, Finding patterns with variable length gaps or don’t cares computing and combinatorics conference. pp. 146- 155 ,(2006) , 10.1007/11809678_17
Amos Marc Bairoch, Philip Bucher, A Generalized Profile Syntax for Biomolecular Sequence Motifs and its Function in Automatic Sequence Interpretation intelligent systems in molecular biology. ,vol. 2, pp. 53- 61 ,(1994)
Philip Bille, Mikkel Thorup, Faster Regular Expression Matching Automata, Languages and Programming. pp. 171- 182 ,(2009) , 10.1007/978-3-642-02927-1_16
Kimmo Fredriksson, Szymon Grabowski, Nested Counters in Bit-Parallel String Matching Language and Automata Theory and Applications. pp. 338- 349 ,(2009) , 10.1007/978-3-642-00982-2_29
Philip Bille, New Algorithms for Regular Expression Matching Automata, Languages and Programming. pp. 643- 654 ,(2006) , 10.1007/11786986_56
Donald E Knuth, James H Morris, Jr, Vaughan R Pratt, Fast Pattern Matching in Strings SIAM Journal on Computing. ,vol. 6, pp. 323- 350 ,(1977) , 10.1137/0206024
Kimmo Fredriksson, Szymon Grabowski, Efficient algorithms for pattern matching with general gaps, character classes, and transposition invariance Information Retrieval. ,vol. 11, pp. 335- 357 ,(2008) , 10.1007/S10791-008-9054-Z
Gonzalo Navarro, Mathieu Raffinot, Fast and simple character classes and bounded gaps pattern matching, with applications to protein searching. Journal of Computational Biology. ,vol. 10, pp. 903- 923 ,(2003) , 10.1089/106652703322756140
Mikkel Thorup, Philip Bille, Regular expression matching with multi-strings and intervals symposium on discrete algorithms. pp. 1297- 1308 ,(2010) , 10.5555/1873601.1873705