An Iterative Model for Discovering Person Coreferences Using Name Frequency Estimates

Authors: Octavian Popescu, Bernardo Magnini

Abstract: In this paper we present an approach to person coreference in a large collection of news, based on two main hypotheses. First, coreference is an iterative process, where the easy cases are addressed first and then made available as an incrementally enriched resource for resolving more difficult cases. Second, at each iteration coreference among names is established according to a probabilistic model, in which a number of features (e.g. the frequency of last names) are taken into account. The approach does not assume any prior knowledge about the persons mentioned and requires only basic linguistic processing (named entity recognition) and resources (a dictionary of names). The system parameters have been estimated on a 5K Italian news corpus, and the system has been tested on a corpus containing more than 7 million names. Evaluation over a sample of four days of news shows that the system's error rate (1.4%) improves on the baseline (5.4%) for the task. Finally, we discuss open issues in the evaluation.
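The easy-first idea in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the function name `resolve`, the input format, and the rule "link only when exactly one candidate exists" are assumptions made for illustration, and the count of distinct full names sharing a last name is used as a crude stand-in for the paper's probabilistic frequency features.

```python
from collections import defaultdict


def resolve(mentions):
    """Toy easy-first person coreference (hypothetical, for illustration).

    mentions: list of (doc_id, name) pairs, where name is either a full
    name ("Mario Rossi") or a bare last name ("Rossi").
    Returns one entity label per mention, or None if left unresolved.
    """
    labels = [None] * len(mentions)

    # Iteration 1 (easy cases): assume each distinct full name is one entity.
    for i, (_, name) in enumerate(mentions):
        if len(name.split()) >= 2:
            labels[i] = name

    # Enriched resource built from the easy cases:
    # last name -> full names seen with it, globally and per document.
    by_last = defaultdict(set)
    by_doc_last = defaultdict(set)
    for (doc, name), label in zip(mentions, labels):
        if label is not None:
            last = name.split()[-1]
            by_last[last].add(label)
            by_doc_last[(doc, last)].add(label)

    # Iteration 2 (harder cases): bare last names are linked only when the
    # resource offers exactly one candidate, preferring same-document ones.
    for i, (doc, name) in enumerate(mentions):
        if labels[i] is None:
            local = by_doc_last.get((doc, name), set())
            candidates = local if local else by_last.get(name, set())
            if len(candidates) == 1:
                labels[i] = next(iter(candidates))

    return labels
```

A last name shared by several known persons (a frequent name) stays unresolved in this sketch, whereas a last name seen with a single full name (a rare name) is linked. A probabilistic model such as the one the abstract describes would replace the hard "exactly one candidate" rule with a score.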
