Speeding Up Pattern Matching by Text Sampling

作者: Francisco Claude , Gonzalo Navarro , Hannu Peltola , Leena Salmela , Jorma Tarhio

DOI: 10.1007/978-3-540-89097-3_10

关键词: Factor (programming language)Space (punctuation)Pattern matchingArtificial intelligenceSampling (statistics)String searching algorithmPattern recognitionMathematicsSubsequenceOnline algorithmSuffix array

摘要: We introduce a novel alphabet sampling technique for speeding up both online and indexed string matching. choose subset of the select corresponding subsequence text. Online or searching is then carried out on that subsequence, candidate matches are verified in full show this speeds searching, especially moderate to long patterns, by factor 5. For we achieve indexes as fast classical suffix array, yet occupy space less than 0.5 times text size (instead 4) plus Our experiments no competitive alternatives wide space/time range.

参考文章(18)
Erkki Sutinen, Ricardo A. Baeza-Yates, Jorma Tarhio, Gonzalo Navarro, Indexing methods for approximate string matching IEEE Data(base) Engineering Bulletin. ,vol. 24, pp. 19- 27 ,(2001)
Feifeng Zheng, Stanley P. Y. Fung, Wun-Tat Chan, Francis Y. L. Chin, Chung Keung Poon, Prudence W. H. Wong, Improved on-line broadcast scheduling with deadlines computing and combinatorics conference. pp. 320- 329 ,(2006) , 10.1007/11809678_34
Rodrigo González, Gonzalo Navarro, Compressed Text Indexes with Fast Locate Combinatorial Pattern Matching. pp. 216- 227 ,(2007) , 10.1007/978-3-540-73437-6_23
Juha Kärkkäinen, Esko Ukkonen, Sparse suffix trees Lecture Notes in Computer Science. pp. 219- 230 ,(1996) , 10.1007/3-540-61332-3_155
Paolo Ferragina, Johannes Fischer, Suffix Arrays on Words Combinatorial Pattern Matching. pp. 328- 339 ,(2007) , 10.1007/978-3-540-73437-6_33
Niklaus Wirth, Algorithms and data structures 288 p. : ill. Englewood, New Jersey: Prentice-Hall Inc., 1986. includes bibliography and index. ,(1986)
Jussi Rautio, Jani Tanninen, Jorma Tarhio, String Matching with Stopper Encoding and Code Splitting combinatorial pattern matching. pp. 42- 52 ,(2002) , 10.1007/3-540-45452-7_5
Ricardo A. Baeza-Yates, String Searching Algorithms Revisited workshop on algorithms and data structures. pp. 75- 96 ,(1989) , 10.1007/3-540-51542-9_9
Donald E Knuth, James H Morris, Jr, Vaughan R Pratt, Fast Pattern Matching in Strings SIAM Journal on Computing. ,vol. 6, pp. 323- 350 ,(1977) , 10.1137/0206024