Large-Scale Chinese Cross-Document Entity Disambiguation and Information Fusion

Authors: Xiaoge Li, Sugang Ma, Xiaohui Zhou

Keywords: Pipeline (software), Entity–relationship model, Graph (abstract data type), Artificial intelligence, Natural language processing, Modular design, Information extraction, Computer science, Alias, Knowledge base, Scale (map)

Abstract: Cross-document entity disambiguation is the problem of identifying whether mentions from different documents refer to the same or distinct entities; it arises in information fusion and automated knowledge base construction. In this paper, we describe a Chinese Information Extraction (IE) system based on the Hadoop framework. It involves document-level IE and corpus-level IE, using a pipelined, multi-level modular approach that covers Named Entity Recognition (EDR), relationship extraction, and information fusion. The information associated with each mention name can be merged into rich profiles. For our co-reference and alias module, we performed agglomerative hierarchical clustering using MapReduce. Visualized results in the form of an entity-centric graph have been demonstrated.
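As an illustration of the clustering step named in the abstract, the following is a minimal single-machine sketch of agglomerative hierarchical clustering over mention profiles. It is not the authors' implementation: the abstract does not specify the similarity measure, the linkage criterion, or how the work is split across MapReduce jobs. Jaccard similarity over context-feature sets, average linkage, the `threshold` value, the function names, and the sample data are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical sample data: (document id, mention name, context features).
SAMPLE_MENTIONS = [
    ("doc1", "Li Ming", {"Peking University", "professor", "physics"}),
    ("doc2", "Li Ming", {"Peking University", "physics", "paper"}),
    ("doc3", "Li Ming", {"football", "Shanghai", "striker"}),
]


def jaccard(a, b):
    """Jaccard similarity between two feature sets (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_mentions(mentions, threshold=0.3):
    """Average-link agglomerative clustering of mentions.

    Starts with one cluster per mention and repeatedly merges the two
    most similar clusters until no pair reaches `threshold`.
    Returns a list of clusters, each a list of indices into `mentions`.
    """
    clusters = [[i] for i in range(len(mentions))]

    def avg_link(c1, c2):
        # Mean pairwise similarity between members of the two clusters.
        sims = [jaccard(mentions[i][2], mentions[j][2]) for i in c1 for j in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        best_sim, (x, y) = max(
            (avg_link(clusters[x], clusters[y]), (x, y))
            for x, y in combinations(range(len(clusters)), 2)
        )
        if best_sim < threshold:
            break
        # x < y is guaranteed by combinations, so deleting y is safe.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def merge_profile(mentions, cluster):
    """Union the features of all mentions in a cluster into one profile."""
    profile = set()
    for i in cluster:
        profile |= mentions[i][2]
    return profile
```

Calling `cluster_mentions(SAMPLE_MENTIONS)` groups the first two mentions, which share two context features, and leaves the third on its own; `merge_profile` then builds the fused profile for each group. The pairwise search is quadratic in the number of clusters at every step, which is presumably why a corpus-scale system would distribute the computation.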


References (24)

Silviu Cucerzan. Large-Scale Named Entity Disambiguation Based on Wikipedia Data. Empirical Methods in Natural Language Processing, pp. 708-716 (2007).

James Martin, Ying Chen. Towards Robust Unsupervised Personal Name Disambiguation. Empirical Methods in Natural Language Processing, pp. 190-198 (2007).

Ralph Grishman, Andrew Eliot Borthwick. A maximum entropy approach to named entity recognition. Ph.D. Thesis, New York University (1999).

Daniel M. Bikel, Richard Schwartz, Ralph M. Weischedel. An Algorithm that Learns What's in a Name. Machine Learning, vol. 34, pp. 211-231 (1999). 10.1023/A:1007558221122

Mark Dredze, Tim Finin, Adam Gerber, Delip Rao, Paul McNamee. Entity Disambiguation for Knowledge Base Population. International Conference on Computational Linguistics, pp. 277-285 (2010).

Javier Artiles, Satoshi Sekine, Julio Gonzalo. Web people search. Proceeding of the 17th international conference on World Wide Web - WWW '08, pp. 1071-1072 (2008). 10.1145/1367497.1367661

A. K. Jain, M. N. Murty, P. J. Flynn. Data clustering: a review. ACM Computing Surveys, vol. 31, pp. 264-323 (1999). 10.1145/331499.331504

Qi Li, Sam Anzaroot, Wen-Pin Lin, Xiang Li, Heng Ji. Joint inference for cross-document information extraction. Proceedings of the 20th ACM international conference on Information and knowledge management - CIKM '11, pp. 2225-2228 (2011). 10.1145/2063576.2063932

Wei Li, Andrew McCallum. Rapid development of Hindi named entity recognition using conditional random fields and feature induction. ACM Transactions on Asian Language Information Processing, vol. 2, pp. 290-294 (2003). 10.1145/979872.979879

Kisung Lee, Ling Liu. Efficient data partitioning model for heterogeneous graphs in the cloud. IEEE International Conference on High Performance Computing Data and Analytics, pp. 46- (2013). 10.1145/2503210.2503302
