X-Diff: an effective change detection algorithm for XML documents

Authors: Y. Wang, D.J. DeWitt, J.-Y. Cai

DOI: 10.1109/ICDE.2003.1260818

Abstract: XML has become the de facto standard format for Web publishing and data transportation. Since online information changes frequently, being able to quickly detect changes in XML documents is important to Internet query systems, search engines, and continuous query systems. Previous work in change detection on XML, or other hierarchically structured documents, used an ordered tree model, in which left-to-right order among siblings is important and it can affect the change result. We argue that an unordered model (only ancestor relationships are significant) is more suitable for most database applications. Using an unordered model, change detection is substantially harder than using the ordered model, but the change result that it generates is more accurate. We propose X-Diff, an effective algorithm that integrates key XML structure characteristics with standard tree-to-tree correction techniques. The algorithm is analyzed and compared with XyDiff [CAM02], a published XML diff algorithm. An experimental evaluation on both algorithms is provided.
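
The difference between the ordered and unordered tree models can be illustrated with a minimal Python sketch. This is not the X-Diff algorithm, which computes a minimum-cost edit script between two trees; it only tests equality under each model. The function names `ordered_equal`, `unordered_key`, and `unordered_equal` are hypothetical, and the sketch assumes element-only content (tail text and mixed content are ignored).

```python
import xml.etree.ElementTree as ET


def _text(elem):
    """Return the element's leading text with surrounding whitespace removed."""
    return (elem.text or "").strip()


def ordered_equal(a, b):
    """Ordered model: the left-to-right order of siblings is significant."""
    if (a.tag, a.attrib, _text(a)) != (b.tag, b.attrib, _text(b)):
        return False
    if len(a) != len(b):
        return False
    return all(ordered_equal(x, y) for x, y in zip(a, b))


def unordered_key(elem):
    """Build a canonical key for a subtree that ignores sibling order."""
    children = sorted(unordered_key(child) for child in elem)
    attrs = tuple(sorted(elem.attrib.items()))
    return (elem.tag, attrs, _text(elem), tuple(children))


def unordered_equal(a, b):
    """Unordered model: only ancestor relationships are significant."""
    return unordered_key(a) == unordered_key(b)


# Two documents that differ only in the order of the child elements.
old = ET.fromstring("<book><title>XML</title><year>2003</year></book>")
new = ET.fromstring("<book><year>2003</year><title>XML</title></book>")

print(ordered_equal(old, new))    # False: siblings appear in a different order
print(unordered_equal(old, new))  # True: same children, order ignored
```

Under the ordered model the two documents are reported as different, whereas under the unordered model they are treated as the same record, which matches the database-style view the abstract argues for.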

References (21)
Steven J. DeRose, Inso Corp. XML Path Language (XPath) Version 1.0 (1999).
Fred Douglis, Thomas Ball. Tracking and viewing changes on the web. USENIX Annual Technical Conference, pp. 14-14 (1996).
Vidur Apparao, Gavin Nicol, Mike Champion, Chris Wilson, Inso Eps, Jonathan Robie, Lauren Wood. Document Object Model (DOM) Level 1 Specification (Second Edition) (2000).
Joe Marini. Document Object Model (2002).
Kaizhong Zhang. A New Editing based Distance between Unordered Labeled Trees. Combinatorial Pattern Matching, pp. 254-265 (1993). DOI: 10.1007/BFB0029810
Robert Endre Tarjan. Data Structures and Network Algorithms (1983).
Fred Douglis, Thomas Ball, Yih-Farn Chen, Eleftherios Koutsofios. The AT&T Internet Difference Engine: Tracking and viewing changes on the web. World Wide Web, vol. 1, pp. 27-44 (1998). DOI: 10.1023/A:1019243126596
Tim Bray, Jean Paoli, C. M. Sperberg-McQueen. Extensible Markup Language (XML). World Wide Web, vol. 2, pp. 27-66 (1997).
Kuo-Chung Tai. The Tree-to-Tree Correction Problem. Journal of the ACM, vol. 26, pp. 422-433 (1979). DOI: 10.1145/322139.322143
Kaizhong Zhang, Dennis Shasha. Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal on Computing, vol. 18, pp. 1245-1262 (1989). DOI: 10.1137/0218082