Filter Based Fast Matching of Long Patterns by Using SIMD Instructions.

作者: M. Oguzhan Külekci

DOI:

关键词:

摘要: SIMD instructions exist in many recent microprocessors supporting parallel execution of some operations on multiple data simultaneously via a set special working limited number registers. Although the usage is explored deeply multimedia processing, implementation encryption/decryption algorithms, and scientific calculations, it has not been much addressed pattern matching. This study introduces filter based exact matching algorithm for searching long strings benefiting from Intel’s SSE (streaming extensions) technology. The proposed worst, best, average time complexities O(n · m), O(n/m), O(n/m + n m/2) respectively, while an m bytes text bytes. Experiments small, medium, large alphabet files are conducted to compare performance new with other alternatives, which known be very fast string search operations. In all cases clear winner average. When compared nearest successor, speed improved orders magnitude small sequences. 40 % better medium alphabets, 50 natural language text.

参考文章(12)
Kimmo Fredriksson, Szymon Grabowski, Practical and Optimal String Matching String Processing and Information Retrieval. pp. 376- 387 ,(2005) , 10.1007/11575832_42
Hannu Peltola, Jorma Tarhio, Alternative Algorithms for Bit-Parallel String Matching string processing and information retrieval. pp. 80- 93 ,(2003) , 10.1007/978-3-540-39984-1_7
Thierry Lecroq, Christian Charras, Handbook of Exact String Matching Algorithms ,(2004)
Cyril Allauzen, Maxime Crochemore, Mathieu Raffinot, Factor Oracle: A New Structure for Pattern Matching conference on current trends in theory and practice of informatics. ,vol. 1725, pp. 295- 310 ,(1999) , 10.1007/3-540-47849-3_18
Kimmo Fredriksson, Faster String Matching with Super-Alphabets string processing and information retrieval. pp. 44- 57 ,(2002) , 10.1007/3-540-45735-6_5
Ricardo Baeza-Yates, Gaston H. Gonnet, A new approach to text searching Communications of The ACM. ,vol. 35, pp. 74- 82 ,(1992) , 10.1145/135239.135243
M. Hassaballah, S. Omran, Y. B. Mahdy, A Review of SIMD Multimedia Extensions and their Usage in Scientific and Engineering Applications The Computer Journal. ,vol. 51, pp. 630- 649 ,(2008) , 10.1093/COMJNL/BXM099
Sun Wu, Udi Manber, Fast text searching: allowing errors Communications of The ACM. ,vol. 35, pp. 83- 91 ,(1992) , 10.1145/135239.135244
Thierry Lecroq, Fast exact string matching algorithms Information Processing Letters. ,vol. 102, pp. 229- 235 ,(2007) , 10.1016/J.IPL.2007.01.002
Gonzalo Navarro, Mathieu Raffinot, Fast and flexible string matching by combining bit-parallelism and suffix automata ACM Journal of Experimental Algorithms. ,vol. 5, pp. 4- ,(2000) , 10.1145/351827.384246