Mining text documents for thematic hierarchies using self-organizing maps

Authors: Hsin-Chang Yang, Chung-Hong Lee

DOI: 10.4018/978-1-59140-051-6.CH008

Abstract: Recently, many approaches have been devised for mining various kinds of knowledge from texts. One important application of text mining is to identify themes and the semantic relations among these themes for text categorization. Traditionally, these themes were arranged in a hierarchical manner to achieve effective searching and indexing as well as easy comprehension for human beings. The determination of category themes and their hierarchical structures was mostly done by human experts. In this work, we developed an approach to automatically generate category themes and reveal the hierarchical structure among them. We also used the generated structure to categorize text documents. The document collection was trained by a self-organizing map to form two feature maps. We then analyzed the two maps and obtained the category themes and their structure. Although the test corpus contains documents written in Chinese, the proposed approach can be applied to documents written in any language, provided such documents can be transformed into a list of separated terms.
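
The step of training a self-organizing map on the document collection and deriving two feature maps can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the grid size, the learning schedule, and the helper names (`train_som`, `best_matching_unit`, `build_maps`) are all assumptions made for illustration. It assumes each document is already a list of separated terms, encodes documents as binary term vectors, assigns each document to its best-matching neuron (a document map), and assigns each term to the neuron whose weight for that term is largest (a word map).

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, v):
    """Grid position of the neuron whose weight vector is closest to v."""
    return min(weights, key=lambda pos: sq_dist(weights[pos], v))


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight list}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # linearly decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        for v in vectors:
            bmu = best_matching_unit(weights, v)
            for pos, w in weights.items():
                d2 = sq_dist(pos, bmu)             # squared distance on the grid
                if d2 <= radius * radius:
                    h = math.exp(-d2 / (2.0 * radius * radius))
                    for i in range(dim):
                        w[i] += lr * h * (v[i] - w[i])
    return weights


def build_maps(docs, rows=3, cols=3):
    """Return (vocab, doc_map, word_map) for documents given as term lists."""
    vocab = sorted({term for doc in docs for term in doc})
    vectors = [[1.0 if term in doc else 0.0 for term in vocab] for doc in docs]
    weights = train_som(vectors, rows, cols)
    # Document map: each document index -> its best-matching neuron.
    doc_map = {i: best_matching_unit(weights, v) for i, v in enumerate(vectors)}
    # Word map: each term -> the neuron with the largest weight for that term.
    word_map = {term: max(weights, key=lambda pos: weights[pos][j])
                for j, term in enumerate(vocab)}
    return vocab, doc_map, word_map


# Hypothetical toy corpus; real input would be segmented text documents.
docs = [
    ["stock", "market", "bank"],
    ["bank", "market", "loan"],
    ["team", "match", "goal"],
    ["goal", "match", "coach"],
]
vocab, doc_map, word_map = build_maps(docs)
```

Neurons that attract both documents and terms could then serve as candidate themes, with neighbouring neurons grouped to suggest a hierarchy; how the original work performs that analysis is not specified in the abstract.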
