Speeding up pattern matching by optimal partial string extraction

作者: Jianlong Tan , Xia Liu , Yanbing Liu , Ping Liu

DOI: 10.1109/INFCOMW.2011.5928778

关键词:

摘要: String matching plays a key role in web content monitoring systems. Suffix algorithms have good time efficiency, and thus are widely used. These require that all patterns set the same length. When cannot satisfy this requirement, leftmost characters, m being length of shortest pattern, extracted to construct data structure. We call such -character strings partial strings. However, simple extraction from left does not address impact string locations on search speed. propose novel method extract each pattern which maximizes More specifically, with we can compute corresponding searching cost by theoretical derivation, choose location yields an approximately minimal time. evaluate our two rule sets: Snort ClamAV. Experiments show most cases, achieves fastest speed possible extraction, is about 5%–20% faster than alternative methods.

参考文章(8)
Cyril Allauzen, Maxime Crochemore, Mathieu Raffinot, Efficient Experimental String Matching by Weak Factor Recognition combinatorial pattern matching. pp. 51- 72 ,(2001) , 10.1007/3-540-48194-X_5
Francisco Claude, Gonzalo Navarro, Hannu Peltola, Leena Salmela, Jorma Tarhio, Speeding Up Pattern Matching by Text Sampling String Processing and Information Retrieval. pp. 87- 98 ,(2008) , 10.1007/978-3-540-89097-3_10
Aho AV, JE Hopcroft, JD Ullman, The Design and Analysis of Computer Algorithms ,(1974)
Leena Salmela, Jorma Tarhio, Jari Kytöjoki, Multipattern string matching with q-grams ACM Journal of Experimental Algorithms. ,vol. 11, pp. 1- 19 ,(2007) , 10.1145/1187436.1187438
Anat Bremler-Barr, David Hay, Yaron Koral, CompactDFA: Generic State Machine Compression for Scalable Pattern Matching international conference on computer communications. pp. 659- 667 ,(2010) , 10.1109/INFCOM.2010.5462160
Alfred V. Aho, Margaret J. Corasick, Efficient string matching: an aid to bibliographic search Communications of The ACM. ,vol. 18, pp. 333- 340 ,(1975) , 10.1145/360825.360855