A Normalized Levenshtein Distance Metric

作者: Li Yujian , Liu Bo

DOI: 10.1109/TPAMI.2007.1078

关键词:

摘要: Although a number of normalized edit distances presented so far may offer good performance in some applications, none them can be regarded as genuine metric between strings because they do not satisfy the triangle inequality. Given two X and Y over finite alphabet, this paper defines new distance simple function their lengths (|X| |Y|) Generalized Levenshtein Distance (GLD) them. The easily computed through GLD with complexity O(|X| \cdot it is valued [0, 1] under condition that weight set elementary operations all costs insertions/deletions having same weight. Experiments using AESA algorithm handwritten digit recognition show generally provide similar results to other perform slightly better if inequality violated particular data set.

参考文章(27)
Alan H. Lipkus, A proof of the triangle inequality for the Tanimoto distance Journal of Mathematical Chemistry. ,vol. 26, pp. 263- 265 ,(1999) , 10.1023/A:1019154432472
Omer Egecioglu, Abdullah N. Arslan, Efficient Algorithms For Normalized Edit Distance ,(2000)
V. I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals Soviet physics. Doklady. ,vol. 10, pp. 707- 710 ,(1966)
Robert A. Wagner, Michael J. Fischer, The String-to-String Correction Problem Journal of the ACM. ,vol. 21, pp. 168- 173 ,(1974) , 10.1145/321796.321811
Juan Ramón Rico-Juan, Luisa Micó, Comparison of AESA and LAESA search algorithms using string and tree-edit-distances Pattern Recognition Letters. ,vol. 24, pp. 1417- 1426 ,(2003) , 10.1016/S0167-8655(02)00382-3
Peter H. Sellers, On the Theory and Computation of Evolutionary Distances Siam Journal on Applied Mathematics. ,vol. 26, pp. 787- 793 ,(1974) , 10.1137/0126070
Azriel Rosenfeld, A characterization of parallel thinning algorithms Information and Control. ,vol. 29, pp. 286- 291 ,(1975) , 10.1016/S0019-9958(75)90448-9
Francesc Serratosa, Alberto Sanfeliu, Signatures versus histograms: Definitions, distances and algorithms Pattern Recognition. ,vol. 39, pp. 921- 934 ,(2006) , 10.1016/J.PATCOG.2005.12.005
Gonzalo Navarro, A guided tour to approximate string matching ACM Computing Surveys. ,vol. 33, pp. 31- 88 ,(2001) , 10.1145/375360.375365