A Large Scale Name Matching and Search Framework

Author: Stella Margonar

Abstract: The ability to identify whether two strings represent names referring to the same real-world entity is essential for avoiding information integration problems, such as duplicate records. We study this problem in a scenario where the amount of data to analyze becomes large. Our purpose is to develop a framework that addresses the name matching and search problem by combining different strategies, and that can also take into account the semantics of the strings representing names. Moreover, we propose a dataset for evaluating matching algorithms that covers variations of names.
