Updateable PAT-Tree Approach to Chinese Key Phrase Extraction using Mutual Information: A Linguistic Foundation for Knowledge Management

Authors: Hsinchun Chen, Thian-Huat Ong

Abstract: There has been renewed research interest in using the statistical approach to extract key phrases from Chinese documents, because existing approaches do not allow online frequency updates after phrases have been extracted. This results in inaccurate, partial extraction. In this paper, we present an updateable PAT-tree approach. In our experiment, this approach, compared with that of Lee-Feng Chien, showed an improvement in recall from 0.19 to 0.43 and in precision from 0.52 to 0.70. The paper also reviews the requirements for a data structure that facilitates the implementation of any key-phrase extraction approach, including the PAT-tree, the PAT-array, and the suffix array with semi-infinite strings.

References (33)
G. H. Gonnet, R. Baeza-Yates. Handbook of Algorithms and Data Structures: In Pascal and C (2nd ed.). Addison-Wesley Longman Publishing Co., Inc. (1991)
Lee-Feng Chien. PAT-tree-based adaptive keyphrase extraction for intelligent Chinese information retrieval. Information Processing and Management, vol. 35, pp. 501-521 (1999)
R. Baeza-Yates, G. H. Gonnet. Handbook of Algorithms and Data Structures: In Pascal and C. Addison-Wesley (1991)
Gerald Salton. Automatic Text Processing (1988)
Hsiao-Tieh Pu, Lee-Feng Chien. Important Issues on Chinese Information Retrieval. International Journal of Computational Linguistics & Chinese Language Processing, vol. 1, no. 1, pp. 205-221 (August 1996). DOI: 10.30019/IJCLCLP.199608.0007
Yuh-Min Chen, Chia-Ching Liao, Biren Prasad. A Systematic Approach of Virtual Enterprising Through Knowledge Management Techniques. Concurrent Engineering, vol. 6, pp. 225-244 (1998). DOI: 10.1177/1063293X9800600305
Penelope Jones, Judith Jordan. Knowledge orientations and team effectiveness. International Journal of Technology Management, vol. 16, pp. 152- (1998). DOI: 10.1504/IJTM.1998.002651
Lee-Feng Chien. PAT-tree-based keyword extraction for Chinese information retrieval. International ACM SIGIR Conference on Research and Development in Information Retrieval, vol. 31, pp. 50-58 (1997). DOI: 10.1145/258525.258534
Hsinchun Chen, Yi-Ming Chung, Marshall Ramsey, Christopher C. Yang. A smart itsy bitsy spider for the web. Journal of the Association for Information Science and Technology, vol. 49, pp. 604-618 (1998). DOI: 10.1002/(SICI)1097-4571(1998)49:7<604::AID-ASI3>3.3.CO;2-X
Zimin Wu, Gwyneth Tseng. ACTS: An automatic Chinese text segmentation system for full text retrieval. Journal of the American Society for Information Science, vol. 46, pp. 83-96 (1995). DOI: 10.1002/(SICI)1097-4571(199503)46:2<83::AID-ASI2>3.0.CO;2-0