Accelerating Boyer Moore searches on binary texts

作者: Shmuel T. Klein , Miri Kopel Ben-Nissan

DOI: 10.1007/978-3-540-76336-9_14

关键词: BytePattern matchingAlgorithmString searching algorithmBinary dataBinary numberComputer scienceBoyer–Moore string search algorithmMatching (graph theory)New variant

摘要: The Boyer and Moore (BM) pattern matching algorithm is considered as one of the best, but its performance reduced on binary data. Yet, searching in texts has important applications, such compressed matching. paper shows how, by means some precomputed tables, may implement BM also for case without referring to bits, processing only entire blocks bytes or words, thereby significantly reducing number comparisons. Empirical comparisons show that new variant performs better than regular even BDM.

参考文章(13)
Nieves R. Brisaboa, Antonio Fariña, Gonzalo Navarro, María F. Esteller, (S,C)-Dense Coding: An Optimized Compression Code for Natural Language Text Databases string processing and information retrieval. pp. 122- 136 ,(2003) , 10.1007/978-3-540-39984-1_10
Gonzalo Navarro, Jorma Tarhio, Boyer-Moore String Matching over Ziv-Lempel Compressed Text combinatorial pattern matching. pp. 166- 180 ,(2000) , 10.1007/3-540-45123-4_16
Yusuke Shibata, Tetsuya Matsumoto, Masayuki Takeda, Ayumi Shinohara, Setsuo Arikawa, A Boyer-Moore Type Algorithm for Compressed Pattern Matching combinatorial pattern matching. pp. 181- 194 ,(2000) , 10.1007/3-540-45123-4_17
Kimmo Fredriksson, Faster String Matching with Super-Alphabets string processing and information retrieval. pp. 44- 57 ,(2002) , 10.1007/3-540-45735-6_5
Y. Choueka, S. T. Klein, Y. Perl, Efficient variants of Huffman codes in high level languages Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '85. pp. 122- 130 ,(1985) , 10.1145/253495.342777
Donald E Knuth, James H Morris, Jr, Vaughan R Pratt, Fast Pattern Matching in Strings SIAM Journal on Computing. ,vol. 6, pp. 323- 350 ,(1977) , 10.1137/0206024
R. Nigel Horspool, Practical Fast Searching in Strings Software - Practice and Experience. ,vol. 10, pp. 501- 506 ,(1980) , 10.1002/SPE.4380100608
M. Crochemore, A. Czumaj, L. Gasieniec, S. Jarominek, T. Lecroq, W. Plandowski, W. Rytter, Speeding up two string-matching algorithms Algorithmica. ,vol. 12, pp. 247- 267 ,(1994) , 10.1007/BF01185427
Shmuel T. Klein, Abraham Bookstein, Scott Deerwester, Storing text retrieval systems on CD-ROM: compression and encryption considerations ACM Transactions on Information Systems. ,vol. 7, pp. 230- 245 ,(1989) , 10.1145/65943.65946
Edleno Silva de Moura, Gonzalo Navarro, Nivio Ziviani, Ricardo Baeza-Yates, Fast and flexible word searching on compressed text ACM Transactions on Information Systems. ,vol. 18, pp. 113- 139 ,(2000) , 10.1145/348751.348754