Memory-Aware BWT by Segmenting Sequences to Support Subsequence Search

作者: Jiaying Wang , Xiaochun Yang , Bin Wang , Huaijie Zhu , None

DOI: 10.1007/978-3-642-29253-8_7

关键词:

摘要: Nowadays, Burrows-Wheeler transform (BWT) has been receiving significant attentions in academia for addressing subsequence matching problems. Although BWT is a typical technique to sequence into new that "easy compress", it can also be extended as kind of full text index techniques. Traditional requires nlogn+nlogσ bits build with n characters, where σ size the alphabet. Building long on PCs limited memory great challenge. In order solve problem, we propose novel variation named S-BWT, which separates source segments. It reduce cost n(logσ+logn−logk )/k bits, k number However, querying each segment separately using existing approaches undertake risk losing some results. this paper, two query methods based S-BWT and guarantee find all occurrences. Our not only require small space, but are faster than state-of-art backward search method sequence.

参考文章(24)
Maxime Crochemore, Costas Iliopoulos, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, Extracting powers and periods in a string from its runs structure string processing and information retrieval. ,vol. 6393, pp. 258- 269 ,(2010) , 10.1007/978-3-642-16321-0_27
Juha Kärkkäinen, Stefan Burkhardt, E. Chávez, M. Crochemore, R. Baeza-Yates, Fast Lightweight Suffix Array Construction and Checking Untitled Event. pp. 55- 69 ,(2003)
Feifeng Zheng, Stanley P. Y. Fung, Wun-Tat Chan, Francis Y. L. Chin, Chung Keung Poon, Prudence W. H. Wong, Improved on-line broadcast scheduling with deadlines computing and combinatorics conference. pp. 320- 329 ,(2006) , 10.1007/11809678_34
Jouni Sirén, Niko Välimäki, Veli Mäkinen, Gonzalo Navarro, Run-Length Compressed Indexes Are Superior for Highly Repetitive Sequence Collections String Processing and Information Retrieval. pp. 164- 175 ,(2008) , 10.1007/978-3-540-89097-3_17
Simon J. Puglisi, W. F. Smyth, Andrew H. Turpin, A taxonomy of suffix array construction algorithms ACM Computing Surveys. ,vol. 39, pp. 4- ,(2007) , 10.1145/1242471.1242472
Wing-Kai Hon, Tak-Wah Lam, Kunihiko Sadakane, Wing-Kin Sung, Siu-Ming Yiu, A Space and Time Efficient Algorithm for Constructing Compressed Suffix Arrays Algorithmica. ,vol. 48, pp. 23- 36 ,(2007) , 10.1007/S00453-006-1228-8
Dong Kyue Kim, Jeong Seop Sim, Heejin Park, Kunsoo Park, Constructing suffix arrays in linear time Journal of Discrete Algorithms. ,vol. 3, pp. 126- 142 ,(2005) , 10.1016/J.JDA.2004.08.019
Juha Kärkkäinen, Peter Sanders, Stefan Burkhardt, Linear work suffix array construction Journal of the ACM. ,vol. 53, pp. 918- 936 ,(2006) , 10.1145/1217856.1217858
Roman Dementiev, Juha Kärkkäinen, Jens Mehnert, Peter Sanders, Better external memory suffix array construction ACM Journal of Experimental Algorithmics. ,vol. 12, pp. 1- 24 ,(2008) , 10.1145/1227161.1402296