Entity resolution incorporating data from various data sources which uses tokens and normalizes records

Authors: Murtaza Muidul Huda Chowdhury, Satish J. Thomas

DOI:

Keywords:

Abstract: A pair of records is tokenized to form a normalized representation of the entity represented by each record. The tokens are correlated to a machine learning system to determine whether a learned resolution already exists for the two entities. If not, the tokens are compared to generate a comparison measure that determines whether the records match. The tokens can also be used to perform a web search, with the results used as additional input for matching. When a match is found, the records are updated to indicate that they match, and the result is provided to the machine learning system to update the learned resolutions.
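The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the method claimed in the source: the function names (tokenize, comparison_measure, resolve), the Jaccard similarity used as the comparison measure, the 0.5 threshold, and the plain dictionary standing in for the machine learning system's learned resolutions are all assumptions made for illustration. The web search step is omitted.

```python
import re

# Stand-in for the machine learning system's learned resolutions (assumption:
# a plain in-memory cache keyed by the pair of normalized token tuples).
learned_resolutions = {}


def tokenize(record):
    """Normalize a record: lowercase, drop punctuation, sort unique tokens."""
    return tuple(sorted(set(re.findall(r"[a-z0-9]+", record.lower()))))


def comparison_measure(tokens_a, tokens_b):
    """Jaccard similarity between two token collections (illustrative choice)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def resolve(record_a, record_b, threshold=0.5):
    """Return True if the two records are judged to describe the same entity."""
    tokens_a, tokens_b = tokenize(record_a), tokenize(record_b)
    key = frozenset((tokens_a, tokens_b))
    # Reuse a learned resolution if one already exists for this pair.
    if key in learned_resolutions:
        return learned_resolutions[key]
    # Otherwise compare the tokens and record the outcome for later reuse.
    is_match = comparison_measure(tokens_a, tokens_b) >= threshold
    learned_resolutions[key] = is_match
    return is_match
```

Under these assumptions, resolve("Acme Corp.", "ACME corp") treats the two records as the same entity because both normalize to the same tokens, while resolve("Acme Corp", "Globex Inc") does not because the token sets share nothing.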

References (25)
Greg Bolcer, Alan Chaney, Clay Cover, Enterprise data processing (2013)
Benjamin Rubinstein, David James Gemmell, Ashok K. Chandra, Olivier Jerzy Dabrowski, Entity based search and resolution (2011)
Surajit Chaudhuri, Rahul Kapoor, Venkatesh Ganti, Duplicate data elimination system (2003)
Giuseppe Di Fabbrizio, Amanda Stent, Srinivas Bangalore, System and method for referring to entities in a discourse domain (2008)
David Guy Brizan, Abdullah Uz Tansel, A Survey of Entity Resolution and Record Linkage Methodologies, Communications of the IIMA, vol. 6, pp. 5- (2006)