Authors: Octavian Popescu, Bernardo Magnini
DOI: 10.1007/978-3-642-04235-5_37
Keywords:
Abstract: In this paper we present an approach to person coreference in a large collection of news, based on two main hypotheses. First, coreference is an iterative process, where the easy cases are addressed first and then made available as an incrementally enriched resource for resolving more difficult cases. Second, at each iteration, coreference among person names is established according to a probabilistic model, in which a number of features (e.g. the frequency of first and last names) are taken into account. The approach does not assume any prior knowledge about the persons mentioned in the collection, and it requires only basic linguistic processing (named entity recognition) and resources (a dictionary of person names). The system parameters have been estimated on an Italian news corpus of 5K, and the system has been tested on a collection containing more than 7 million person names. Evaluation over a sample of four days of news shows that the error rate (1.4%) improves on the baseline (5.4%) for the task. Finally, we discuss open issues in the evaluation.