Space efficient streaming algorithms for the distance to monotonicity and asymmetric edit distance

作者: Michael Saks , C. Seshadhri

DOI: 10.5555/2627817.2627939

关键词:

摘要: Approximating the length of longest increasing sequence (LIS) an array is a well-studied problem. We study this problem in data stream model, where algorithm allowed to make single left-to-right pass through and key resource be minimized amount additional memory used. present which, for any δ > 0, given streaming access n provides (1 + δ)-multiplicative approximation distance monotonicity (n minus LIS), uses only O((log2n)/δ) space. The previous best known using polylogarithmic space was multiplicative 2-factor. improved factor reflects qualitative difference between our algorithms: algorithms could not reliably detect subsequences as large n/2, while ours can βn β 0. More precisely, used estimate LIS within additive δn 0 achieve error n(1/2 -- o(1)).Our very simple, being just 3 lines pseudocode, has small update time. It essentially approximate implementation classic dynamic program that computes LIS.We also show how technique applied other problems solvable by programs. For example, we give approximating LCS(x, y), common subsequence strings x y, each n. Our works asymmetric setting (inspired [AKO10]), which have random y x, runs provided no symbol appears often y. it gives additive-δn y) (and hence E(x, = edit when insertions deletions, but substitutions, are allowed), with complexity O(k(log2n)/δ), k maximum number times one y.We provide deterministic 1-pass outputs (which δn-approximation), setting, O(√n/δ log(n)) All these obtained carefully trading accuracy standard program.

参考文章(28)
ELDAR FISCHER, THE ART OF UNINFORMED DECISIONS: A PRIMER TO PROPERTY TESTING Current Trends in Theoretical Computer Science. pp. 229- 263 ,(2004) , 10.1142/9789812562494_0014
Eyal Kushilevitz, Shirley Halevy, Distribution-Free Property Testing. randomization and approximation techniques in computer science. pp. 302- 317 ,(2003)
Piotr Indyk, Robert Krauthgamer, Alexandr Andoni, Overcoming the l1 non-embeddability barrier: algorithms for product metrics symposium on discrete algorithms. pp. 865- 874 ,(2009) , 10.5555/1496770.1496864
Alexandr Andoni, Robert Krauthgamer, The Smoothed Complexity of Edit Distance international colloquium on automata languages and programming. pp. 357- 369 ,(2008) , 10.1007/978-3-540-70575-8_30
Yevgeniy Dodis, Oded Goldreich, Eric Lehman, Sofya Raskhodnikova, Dana Ron, Alex Samorodnitsky, Improved Testing Algorithms for Monotonicity Randomization, Approximation, and Combinatorial Optimization. Algorithms and Techniques. pp. 97- 108 ,(1999) , 10.1007/978-3-540-48413-4_10
Gonzalo Navarro, A guided tour to approximate string matching ACM Computing Surveys. ,vol. 33, pp. 31- 88 ,(2001) , 10.1145/375360.375365
William J. Masek, Michael S. Paterson, A faster algorithm computing string edit distances Journal of Computer and System Sciences. ,vol. 20, pp. 18- 31 ,(1980) , 10.1016/0022-0000(80)90002-1
C. Schensted, Longest Increasing and Decreasing Subsequences Canadian Journal of Mathematics. ,vol. 13, pp. 299- 311 ,(1961) , 10.1007/978-0-8176-4842-8_21
Robert Krauthgamer, T. S. Jayram, Parikshit Gopalan, Ravi Kumar, Estimating the sortedness of a data stream symposium on discrete algorithms. pp. 318- 327 ,(2007) , 10.5555/1283383.1283417
Nir Ailon, Bernard Chazelle, Ding Liu, Seshadhri Comandur, Estimating the distance to a monotone function Random Structures and Algorithms. ,vol. 31, pp. 371- 383 ,(2007) , 10.1002/RSA.V31:3