Authors: Yuejie Zhang, Xiangyang Xue, Lei Cen, Cheng Jin, Wei Wu
DOI: 10.5591/978-1-57735-516-8/IJCAI11-321
Keywords: Fusion, Mathematics, Machine learning, Supervised learning, Heuristic, Artificial intelligence, Term (time)
Abstract: In this paper, to support more precise Chinese Out-of-Vocabulary (OOV) term detection and Part-of-Speech (POS) guessing, a unified mechanism is proposed and formulated based on the fusion of multiple features and supervised learning. Besides all the traditional features, new features for statistical information and global contexts are introduced, as well as some constraints and heuristic rules, which reveal the relationships among OOV term candidates. Our experiments on corpora from both People's Daily and SIGHAN 2005 have achieved consistent results, which are better than those acquired by pure rule-based or statistics-based models. From the experimental results of combining our model with monolingual retrieval on the data sets of TREC-9, it is found that an obvious improvement in retrieval performance can also be obtained.