Fast and flexible packed string matching

作者: Simone Faro , M. Oğuzhan Külekci

DOI: 10.1016/J.JDA.2014.07.003

关键词: Boyer–Moore string search algorithmTime complexityCommentz-Walter algorithmComputer scienceString metricRabin–Karp algorithmStability (learning theory)Streaming SIMD ExtensionsAlgorithmString searching algorithm

摘要: Searching for all occurrences of a pattern in text is fundamental problem computer science with applications many other fields, like natural language processing, information retrieval and computational biology. In the last two decades general trend has appeared trying to exploit power word RAM model speed-up performances classical string matching algorithms. this an algorithm operates on words length w, grouping blocks characters, arithmetic logic operations take one unit time.In paper we use specialized word-size packed instructions, based Intel streaming SIMD extensions (SSE) technology, design very fast algorithm. We evaluate our solution terms efficiency, stability flexibility, where propose deviation running time distinct equal patterns as measure stability.From experimental results it turns out that, despite their quadratic worst case complexity, new presented becomes clear winner average cases, when compared against most recent effective algorithms known literature. A instructions.The works using technology.We prove effectiveness times, flexibility.Our be fastest short case.We conduct discuss wide series tests.

参考文章(33)
M. Oguzhan Külekci, Filter Based Fast Matching of Long Patterns by Using SIMD Instructions. prague stringology conference. pp. 118- 128 ,(2009)
Djamal Belazzougui, Mathieu Raffinot, Average Optimal String Matching in Packed Strings international conference on algorithms and complexity. pp. 37- 48 ,(2013) , 10.1007/978-3-642-38233-8_4
Simone Faro, M. Oğuzhan Külekci, Fast multiple string matching using streaming SIMD extensions technology string processing and information retrieval. pp. 217- 228 ,(2012) , 10.1007/978-3-642-34109-0_23
Simone Faro, Domenico Cantone, Fast-Search Algorithms: New Efficient Variants of the Boyer-Moore Pattern-Matching Algorithm. Journal of Automata, Languages and Combinatorics. ,vol. 10, pp. 589- 608 ,(2005)
Simone Faro, Thierry Lecroq, A Multiple Sliding Windows Approach to Speed Up String Matching Algorithms Experimental Algorithms. pp. 172- 183 ,(2012) , 10.1007/978-3-642-30850-5_16
Branislav Ďurian, Jorma Tarhio, Hannu Peltola, Jan Holub, Tuning BNDM with > q -grams algorithm engineering and experimentation. pp. 29- 37 ,(2009)
Kimmo Fredriksson, Szymon Grabowski, Practical and Optimal String Matching String Processing and Information Retrieval. pp. 376- 387 ,(2005) , 10.1007/11575832_42
Thierry Lecroq, Christian Charras, Handbook of Exact String Matching Algorithms ,(2004)
Rahul Thathoo, Ashish Virmani, S Sai Lakshmi, N Balakrishnan, K Sekar, None, TVSBS: a fast exact pattern matching algorithm for biological sequences Current Science. ,vol. 91, pp. 47- 53 ,(2006)
Simone Faro, Thierry Lecroq, An Efficient Matching Algorithm for Encoded DNA Sequences and Binary Strings combinatorial pattern matching. pp. 106- 115 ,(2009) , 10.1007/978-3-642-02441-2_10