Authors: Cheng-Wei Lee, Cheng-Wei Shih, Tzong-Han Tsai, Shih-Hung Wu, Wen-Lian Hsu
DOI: 10.30019/IJCLCLP.200402.0004
Keywords:
Abstract: This paper presents a Chinese named entity recognizer (NER): Mencius. It aims to address Chinese NER problems by combining the advantages of rule-based and machine learning (ML) based NER systems. Rule-based NER systems can explicitly encode human comprehension and can be tuned conveniently, while ML-based systems are robust, portable, and inexpensive to develop. Our hybrid system incorporates a rule-based knowledge representation and template-matching tool, called InfoMap [Wu et al. 2002], into a maximum entropy (ME) framework. Named entities are represented in InfoMap as templates, which serve as ME features in Mencius. These features are edited manually, and their weights are estimated by the ME framework according to the training data. To understand how word segmentation might influence Chinese NER, and the differences between a pure template-based method and our hybrid method, we configure Mencius using four distinct settings. The F-measures for person names (PER), location names (LOC), and organization names (ORG) of the best configuration in our experiment were 94.3%, 77.8%, and 75.3%, respectively. Comparing the results obtained using these configurations reveals that hybrid NER systems always perform better in identifying person names. On the other hand, they have a little difficulty identifying location and organization names. Furthermore, a word segmentation module improves the performance of pure template-based NER systems, but it has little effect on hybrid NER systems.
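The idea of templates serving as ME features can be illustrated with a minimal sketch. It assumes each template acts as a binary feature that fires on a candidate string, and that the ME model is the usual log-linear form p(y | x) ∝ exp(Σ w·f(x, y)). The template functions, label set, and weights below are hypothetical and chosen only for illustration; they are not InfoMap templates or values from the paper, where the weights would be estimated from training data.

```python
import math

# Hypothetical templates standing in for InfoMap templates (not from the paper).
# Each one is a binary test on a candidate string.
def has_common_surname(candidate):
    return candidate[:1] in {"李", "王", "陳"}

def ends_with_org_suffix(candidate):
    return candidate.endswith(("公司", "大學"))

TEMPLATES = [has_common_surname, ends_with_org_suffix]
LABELS = ["PER", "ORG", "OTHER"]

# Illustrative weights for (template index, label) pairs. In an ME framework
# these would be estimated from training data rather than set by hand.
WEIGHTS = {
    (0, "PER"): 2.0,
    (1, "ORG"): 2.5,
}

def label_probabilities(candidate):
    """Return p(label | candidate) under a log-linear (maximum entropy) model."""
    # Indices of the templates that match this candidate.
    fired = [i for i, template in enumerate(TEMPLATES) if template(candidate)]
    # Score of each label: sum of weights of the fired (template, label) features.
    scores = {y: sum(WEIGHTS.get((i, y), 0.0) for i in fired) for y in LABELS}
    # Normalize with the partition function so the probabilities sum to 1.
    z = sum(math.exp(s) for s in scores.values())
    return {y: math.exp(s) / z for y, s in scores.items()}

if __name__ == "__main__":
    for text in ["李登輝", "台灣大學"]:
        probs = label_probabilities(text)
        print(text, max(probs, key=probs.get))
```

In this sketch a template that matches does not decide the label by itself; it only contributes a weighted vote, which is the sense in which a hybrid model can soften hand-written rules.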