Towards Robust Unsupervised Personal Name Disambiguation

Authors: James Martin, Ying Chen

Abstract: The increasing use of large open-domain document sources is exacerbating the problem of ambiguity in named entities. This paper explores a range of syntactic and semantic features for the unsupervised clustering of documents that result from ad hoc queries containing names. From these experiments, we find that robust syntactic and semantic features can significantly improve the state of the art in disambiguation performance for personal names in both Chinese and English.
