Robust system for interactively learning a record similarity measurement

Authors: Douglas Burdick, Robert Szczerba

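Abstract: A system learns a record similarity measurement. The system includes a set of record clusters. Each record in each cluster may have a list of fields and data contained in each field. The system may further include a predetermined threshold score for two of the records in one of the clusters to be considered similar. The system may further include at least one decision tree constructed from a portion of the set of record clusters. The decision tree encodes rules for determining a field similarity score of a related set of fields. The system may further include an output set of record pairs that are determined to be duplicate records. Each pair in the output set has a record similarity score greater than or equal to the predetermined threshold score.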
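
The flow described in the abstract (field-level scores, a record-level score, and a threshold applied to pairs within a cluster) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function names, the hand-written rules standing in for a learned decision tree, the 0.8 near-match cutoff, the averaging of field scores, the sample records, and the 0.9 threshold.

```python
from difflib import SequenceMatcher
from itertools import combinations


def field_similarity(a: str, b: str) -> float:
    """Hypothetical hand-written rules standing in for a learned decision tree."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:   # rule 1: a missing value gives no evidence
        return 0.0
    if a == b:           # rule 2: exact match after normalization
        return 1.0
    ratio = SequenceMatcher(None, a, b).ratio()
    return ratio if ratio >= 0.8 else 0.0   # rule 3: keep only near matches


def record_similarity(r1: dict, r2: dict, fields: list) -> float:
    """Average the field scores (an assumed way to combine them)."""
    total = sum(field_similarity(r1.get(f, ""), r2.get(f, "")) for f in fields)
    return total / len(fields)


def duplicate_pairs(clusters: list, fields: list, threshold: float) -> list:
    """Return (id1, id2, score) for same-cluster pairs scoring >= threshold."""
    out = []
    for cluster in clusters:
        for r1, r2 in combinations(cluster, 2):
            score = record_similarity(r1, r2, fields)
            if score >= threshold:
                out.append((r1["id"], r2["id"], score))
    return out


# Made-up sample data: one cluster of three records.
clusters = [
    [
        {"id": 1, "name": "John Smith", "city": "Boston"},
        {"id": 2, "name": "Jon Smith", "city": "Boston"},
        {"id": 3, "name": "Mary Jones", "city": "Denver"},
    ]
]
pairs = duplicate_pairs(clusters, ["name", "city"], threshold=0.9)
print(pairs)
```

With this sample data, only records 1 and 2 clear the threshold. In the system the abstract describes, the field-level rules would come from a decision tree built from a portion of the record clusters rather than being written by hand as they are here.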
