Authors: Surajit Chaudhuri, Rahul Kapoor, Venkatesh Ganti
DOI:
Keywords: Graph (abstract data type), Tuple, Data records, Computer science, Information retrieval
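Abstract: A process for finding similar data records from a set of data records. A database table or tables provide a number of data records from which one or more canonical data records are identified. Tokens are identified within the data records and classified according to attribute field. A similarity score is assigned to data records in relation to other data records based on a similarity between tokens of the data records. Data records whose similarity score with respect to each other is greater than a threshold form one or more groups of data records. The records or tuples form nodes of a graph wherein edges between nodes represent a similarity score between records of a group. Within each group, a canonical record ...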
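
The pipeline outlined in the abstract (tokenize by attribute field, score pairs, apply a threshold, form a graph, read off groups) can be illustrated with a minimal Python sketch. Several details are assumptions, since the abstract does not specify them: Jaccard overlap stands in for the similarity score, groups are taken to be connected components of the thresholded graph, and the canonical record is chosen as the group member with the largest total similarity to the other members. The function names, the sample records, and the threshold value 0.45 are hypothetical.

```python
from itertools import combinations


def tokenize(record):
    """Split each attribute value into lowercase tokens tagged with the field name."""
    return {(field, tok)
            for field, value in record.items()
            for tok in str(value).lower().split()}


def jaccard(a, b):
    """Jaccard overlap of two token sets (an assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_records(records, threshold):
    """Build a graph whose edges join records scoring above the threshold,
    then return its connected components and the edge scores."""
    tokens = [tokenize(r) for r in records]
    scores = {}
    adjacency = {i: set() for i in range(len(records))}
    for i, j in combinations(range(len(records)), 2):
        s = jaccard(tokens[i], tokens[j])
        if s > threshold:
            scores[(i, j)] = s
            adjacency[i].add(j)
            adjacency[j].add(i)

    seen, groups = set(), []
    for start in range(len(records)):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in adjacency[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(component))
    return groups, scores


def pick_canonical(group, scores):
    """Assumed rule: the member with the largest total edge score within
    its group; ties go to the lowest record index."""
    def total(i):
        return sum(scores.get((min(i, j), max(i, j)), 0.0)
                   for j in group if j != i)
    return max(group, key=lambda i: (total(i), -i))


# Hypothetical sample data for illustration only.
records = [
    {"name": "Microsoft Corp", "city": "Redmond"},
    {"name": "Microsoft Corporation", "city": "Redmond"},
    {"name": "Microsoft Corp", "city": "Redmond WA"},
    {"name": "Boeing Company", "city": "Seattle"},
]
groups, scores = group_records(records, threshold=0.45)
canonical = [pick_canonical(g, scores) for g in groups]
print(groups, canonical)
```

Tagging each token with its field name keeps a token in one attribute from matching the same string in another, which is one simple way to reflect the abstract's classification of tokens by attribute field. Comparing every pair of records is quadratic in the number of records, so this sketch is only suitable for small inputs.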