USeg: A Retargetable Word Segmentation Procedure for Information Retrieval

Author: J. Ponte

DOI:

Keywords: Task (project management), Delimiter, Search engine indexing, Automaton, Segmentation, Information retrieval, Text segmentation, Computer science, Natural language processing, Human–computer information retrieval, Visual Word, Artificial intelligence

Abstract: Many languages, such as Chinese, are written without interword delimiters. For these languages, a segmenter is a required pre-processing step for information retrieval systems. We describe USeg, a platform for word segmentation designed to fulfill the requirements imposed by this task. USeg is based on an underlying probabilistic automaton which serves as a simple language model. A description of the proposed model(s), implementation issues of the models, and experimental results are presented. The experiments show that a fairly simple model can produce reasonable results, can do so quickly enough to be useful for indexing in an information retrieval system, and can be re-targeted to new languages without a great deal of human effort.
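
To make the idea of segmenting with a simple probabilistic language model concrete, the following is a minimal sketch, not USeg's actual automaton or implementation, which the abstract does not specify. It assumes a unigram model: each candidate word has an independent probability, and dynamic programming picks the split with the highest total log-probability. The function name `segment`, the parameters `max_len` and `unknown_prob`, and the toy probabilities are all hypothetical and chosen only for illustration.

```python
import math


def segment(text, word_probs, max_len=4, unknown_prob=1e-8):
    """Return the most probable split of `text` under a unigram model.

    Assumptions (illustrative only): words are independent, and any single
    character missing from `word_probs` is allowed as a word with the small
    probability `unknown_prob`, so every input can be segmented.
    """
    n = len(text)
    # best[i] holds (best log-probability of text[:i], start of its last word)
    best = [(0.0, 0)] + [(-math.inf, 0)] * n
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in word_probs:
                p = word_probs[piece]
            elif len(piece) == 1:
                p = unknown_prob  # fallback for unseen single characters
            else:
                continue  # unseen multi-character strings are not words
            score = best[start][0] + math.log(p)
            if score > best[end][0]:
                best[end] = (score, start)
    # Follow the stored start indices backwards to recover the words.
    words = []
    i = n
    while i > 0:
        start = best[i][1]
        words.append(text[start:i])
        i = start
    return words[::-1]


# Toy probabilities, invented for this example (not taken from the paper).
toy_probs = {"信息": 0.3, "检索": 0.3, "系统": 0.2,
             "信": 0.05, "息": 0.05, "检": 0.05, "索": 0.05}
print(segment("信息检索系统", toy_probs))
```

Under this sketch, retargeting to another language would only mean supplying a different `word_probs` table, which loosely mirrors the retargetability the abstract describes; how USeg itself is retargeted is not stated in this record.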
