作者: Heikki Hyyrö
DOI:
关键词:
摘要: The edit distance between strings A and B is defined as the minimum number of operations needed in converting into or vice versa. Levenshtein allows three types operations: an insertion, a deletion substitution character. Damerau previous plus addition transposition two adjacent characters. To our best knowledge current practical algorithms for computing these distances run time O(dm) O(⌈m/w⌉(n + σ)), where d strings, m n are their lengths (m ≤ n), w computer word size σ alphabet. In this paper we present algorithm that runs O(⌈d/w⌉m ⌈n/w⌉σ) O(⌈d/w⌉n ⌈m/w⌉σ). structure such, practice it mostly suitable testing whether within some pre-determined error threshold. We also initial test results with thresholded computation. them works faster than original Myers.