Authors: Xiaoge Li, Sugang Ma, Xiaohui Zhou
DOI: 10.1007/978-3-319-10596-3_9
Keywords: Pipeline (software), Entity–relationship model, Graph (abstract data type), Artificial intelligence, Natural language processing, Modular design, Information extraction, Computer science, Alias, Knowledge base, Scale (map)
Abstract: Cross-document entity disambiguation is the problem of identifying whether mentions from different documents refer to the same or to distinct entities; it arises in information fusion and automated knowledge base construction. In this paper, we describe a Chinese Information Extraction (IE) system based on the Hadoop framework, which involves document-level IE and corpus-level IE, with a pipelined, multi-level modular approach to Named Entity Recognition (EDR), relationship extraction, and fusion. The information associated with each mention name can be merged into rich profiles. For our co-reference and alias module, we performed agglomerative hierarchical clustering using MapReduce. The visualized results, in the form of an entity-centric graph, have been demonstrated.
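As a rough illustration of the clustering step the abstract mentions, the sketch below groups mention profiles with a greedy agglomerative procedure. It is a single-process toy, not the authors' MapReduce implementation: the Jaccard similarity, the `threshold` value, the feature sets, and the names `jaccard` and `cluster_mentions` are all assumptions introduced here for illustration. In a Hadoop setting the pairwise comparisons would presumably be distributed across mappers and reducers.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two feature sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_mentions(profiles: dict, threshold: float = 0.3) -> list:
    """Greedy agglomerative clustering of mention profiles.

    `profiles` maps a hypothetical mention id to a set of context features.
    The two most similar clusters are merged (their features are unioned)
    until no pair reaches `threshold`.
    """
    # Start with one cluster per mention: (member ids, merged features).
    clusters = [({mid}, set(feats)) for mid, feats in profiles.items()]
    while len(clusters) > 1:
        best_pair, best_sim = None, threshold
        for i, j in combinations(range(len(clusters)), 2):
            sim = jaccard(clusters[i][1], clusters[j][1])
            if sim >= best_sim:
                best_pair, best_sim = (i, j), sim
        if best_pair is None:
            break  # no remaining pair is similar enough to merge
        i, j = best_pair
        merged = (clusters[i][0] | clusters[j][0],
                  clusters[i][1] | clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(ids) for ids, _ in clusters)


# Made-up example: three mentions of the same name from different documents.
example_profiles = {
    "d1": {"Beijing", "professor", "university"},
    "d2": {"professor", "university", "physics"},
    "d3": {"footballer", "Shanghai", "club"},
}
print(cluster_mentions(example_profiles))
```

Here the mentions from `d1` and `d2` share enough context to be merged into one entity profile, while `d3` stays separate. Unioning features on merge mirrors the abstract's idea of building richer profiles as mentions are combined, though the actual linkage criterion used in the paper is not stated in this record.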