An Iterative Model for Discovering Person Coreferences Using Name Frequency Estimates

Authors: Octavian Popescu, Bernardo Magnini

DOI: 10.1007/978-3-642-04235-5_37

Abstract: In this paper we present an approach to person coreference in a large collection of news, based on two main hypotheses. First, coreference is an iterative process, where the easy cases are addressed first and then made available as an incrementally enriched resource for resolving more difficult cases. Second, at each iteration, coreference among person names is established according to a probabilistic model, in which a number of features (e.g., the frequency of last names) are taken into account. The approach does not assume any prior knowledge about the persons mentioned in the collection, and requires only basic linguistic processing (named entity recognition) and resources (a dictionary of person names). The system parameters have been estimated on a 5K Italian news corpus, and the system has been tested on a collection containing more than 7 million person names. Evaluation over a sample of four days of news shows that the system's error rate (1.4%) improves on the baseline (5.4%) for the task. Finally, we discuss open issues for the evaluation.

References (8)
Bernardo Magnini, Emanuele Pianta, Manuela Speranza, Octavian Popescu. Ontology Population from Textual Mentions: Task Definition and Benchmark. Meeting of the Association for Computational Linguistics, pp. 26-32 (2006).
Ted Pedersen, Amruta Purandare, Anagha Kulkarni. Name Discrimination by Clustering Similar Contexts. Computational Linguistics and Intelligent Text Processing, vol. 3406, pp. 226-237 (2005). DOI: 10.1007/978-3-540-30586-6_24
Bernardo Magnini, Emanuele Pianta, Luciano Serafini, Manuela Speranza, Octavian Popescu. From Mentions to Ontology: A Pilot Study. Semantic Web Applications and Perspectives (2006).
Javier Artiles, Julio Gonzalo, Satoshi Sekine. The SemEval-2007 WePS Evaluation: Establishing a benchmark for the Web People Search Task. Meeting of the Association for Computational Linguistics, pp. 64-69 (2007). DOI: 10.3115/1621474.1621486
Ralph Grishman. Whither written language evaluation? Proceedings of the workshop on Human Language Technology - HLT '94, pp. 120-125 (1994). DOI: 10.3115/1075812.1075836
Giovanni Carlo Ettorre. L’intelligenza artificiale. Springer, Milano, pp. 197-210 (2010). DOI: 10.1007/978-88-470-1667-5_16
Bernardo Magnini, V. Bartalesi, Emanuele Pianta, Rachele Sprugnoli, Lorenza Romano, Christian Girardi, Manuela Speranza, Matteo Negri. I-CAB: the Italian Content Annotation Bank. Language Resources and Evaluation, pp. 963-968 (2006).
Amit Bagga, Breck Baldwin. Entity-based cross-document coreferencing using the Vector Space Model. Proceedings of the 17th International Conference on Computational Linguistics (1998). DOI: 10.3115/980451.980859